14 April 2013

ConTest (meetup) - Security Testing

What is ConTest?
ConTest (link in Swedish) is a local test meetup in Malmö, Sweden, started by Henrik Andersson (please correct me if I'm lying). Each meetup has a theme, and participants provide content by sharing short presentations/lightning talks (approx. 5 mins) that are followed by a facilitated discussion.

First Talk: Martin Thulin: Learning Security Testing
Martin, just like me, started exploring the world of security testing quite recently. In his talk he went through his top three resources for self-education in security testing:
  • Google security blog
    General information about online security
    Great cheat sheets, basic security information, list of security testing tools and much much more.
  • Hack this site
    Step by step hacking challenges used for practice/learning.
But maybe most importantly he shared the message:
"Anyone can become a hacker"

Great topic and great presentation!

In the discussion that followed, tons of resources came up. I won't list them all here; instead I urge you to check out the #foocontest hashtag on Twitter.

Second Talk: Me: Quick Tests in Security Testing
Quick tests are cheap (quick, simple, low resource), general tests that usually indicate the existence, or potential existence, of a common problem or group of problems. Even though they rarely prove the absence of a certain problem, they can often be a good start to help you highlight common problems early (disclaimer: this definition/description of quick tests might not match James Bach's and Michael Bolton's, which is used e.g. in RST).

I spoke about two quick tests (for web) I've used:
  • Refresh
  • Cocktail string
Refresh means that whenever I find a page that seems to make a lot of database calls, do heavy calculations or connect to external servers (like an email or SMS server), I simply press and hold F5 (refresh) for a short while. What I'm looking for while doing this is database errors, any changes in content/design, error messages in general and, finally, fully context-dependent stuff like received mails when the page is calling an email server, or alarming patterns in logs.

In practice I've used this to take down a couple of databases (or rather connection to the databases). Simple and effective. Credits to Joel Rydén by the way who taught me this.
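If you want something more repeatable than a finger on F5, the same idea can be sketched in a few lines of Python. This is only a rough sketch: the `hammer` function, the `fetch` hook and the error markers are my own made-up names, and the markers are just guesses at what a stressed stack might print.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple

# Hypothetical error signatures; adjust to whatever your stack prints.
ERROR_MARKERS = ("SQL syntax", "Too many connections", "Fatal error")


def hammer(fetch: Callable[[], Tuple[int, str]], times: int = 50,
           markers: Iterable[str] = ERROR_MARKERS) -> Counter:
    """Call fetch() repeatedly and tally what came back.

    fetch is any zero-argument function returning (status_code, body),
    for example a thin wrapper around your HTTP client of choice.
    """
    markers = tuple(markers)
    seen = Counter()
    for _ in range(times):
        try:
            status, body = fetch()
        except Exception as exc:  # timeouts, refused connections, ...
            seen[f"exception:{type(exc).__name__}"] += 1
            continue
        seen[f"status:{status}"] += 1
        for marker in markers:
            if marker in body:
                seen[f"marker:{marker}"] += 1
    return seen
```

Anything other than a boring pile of `status:200` in the result is worth a closer look. The context-dependent checks (received mails, logs) still have to be done by hand, and it should of course only be pointed at systems you are allowed to test.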

The idea with the Cocktail String is described in a separate post.

As a bonus I can provide you with a few others:

  • Scramble cookies
    Just quickly change the contents of the cookies a page creates. Try to generate errors by e.g. using strings where a number is set, try other plausible values (0 is always interesting, negative values as well) and combine with the cocktail string.
  • Back button
    When viewing sensitive data, log out and try pressing back. Is the data visible? Common and scary bug in some systems (systems expected to be used on any shared computer/handling very sensitive data), irrelevant in others (remember browsers deal with caching differently).
  • Bob
    When creating an account, first try the password "bob". It's so insecure that very few systems should allow it (but many do).
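For the cookie scrambling, a small helper can suggest what to type in. This is a minimal sketch with a made-up function name (`scramble_cookie`); the chosen values are just the ones mentioned above plus a couple of obvious neighbours, not any kind of complete list.

```python
from typing import List

COCKTAIL = "'\"$%>*<!--"


def scramble_cookie(value: str) -> List[str]:
    """Return replacement values worth trying for a single cookie."""
    variants = ["0", "-1", "", COCKTAIL, value + COCKTAIL]
    try:
        number = int(value)
    except ValueError:
        # A string seems to be expected: try a number instead.
        variants.append("12345")
    else:
        # A number is expected: try a string, a neighbour and a huge value.
        variants += ["abc", str(number + 1), "9" * 20]
    # Drop duplicates and the original value, keep the order.
    return [v for v in dict.fromkeys(variants) if v != value]
```

For a cookie holding `42` this suggests, among others, `0`, `-1`, `abc`, `43` and the cocktail string itself. Paste them in one at a time with your browser's developer tools and reload.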

Final Talk: Sigurdur Birgisson: Security, Usability... huh?
I was a bit disappointed about Sigge's talk, not that it was bad (it was actually awesome) but because I was hoping he would share something really smart about how to deal with the often conflicting wishes of security and usability (like captcha). It also had very little to do with security testing, but who cares...

So what was it about? Sigge talked about how he thought the quality characteristics (CRUSSPIC STMPL mnemonic) were often perceived as too abstract when talking to stakeholders. Instead he used the Software Quality Characteristics published on the Test Eye. What he did was print a card for each "sub characteristic" and, for best effect, try to add matching examples from the product examined. The goals were to get the cards (characteristics) prioritized to aid the testing focus, to support a healthier discussion of what the product needed to do, and to make stakeholders care about them (it's not just about new features).

- It must be quick, quick is above everything else!
- What about data integrity?
When I said that, the customer started to hesitate
// Sigge

Henrik Andersson also shared an interesting story related to this presentation where he had gotten the top managers in a company to prioritize the original 8 quality characteristics (CRUSSPIC) in the context of a product and used this both when testing and when reporting status. Brilliant as well!

There is a lot more to say about this presentation and I might get back to it in the future. For now, just check out Sigge's blog. Finally, it made me think of ways to improve my own ongoing prioritization job with my product owner; for that I'm really grateful!

Summary of ConTest
Great people, great mix of people, great discussions, great presentations, great facility, great to meet Sigge before his Australia adventure, great to meet Henrik Andersson for the first time, great to try ConTest's format and great to get new insights about security testing. ConTest was a blast! Thank you Malmö, Foo Café and all the ConTest participants!

11 April 2013

Cocktail Strings - Quick Test for Web Security Testing

This is a part of my talk at ConTest this evening. I started blogging about the whole event but time is running out, so I'll start with this and provide you with the rest tomorrow.

Cocktail String?
Cocktail Strings are a form of quick test that can be used in security testing. The idea is a general string with a mix of ending characters and other stuff that can be used for MySQL injections, XSS and similar. Here's the example I provided:

'"$%>*<!--

The initial single and double quotes are there to break badly sanitized query strings or similar. The $ is to provoke an error in case the string is evaluated as PHP, and %> is to end a PHP or HTML tag (strictly speaking %> closes ASP-style tags, which PHP only honors when asp_tags is enabled; a regular PHP block ends with ?>). I don't know what the star is for, but since it's often used as a wildcard in various situations I figured it might provide something. The ending <!-- is my secret weapon: safely, but with a cool effect, it exposes the possibility to execute user-defined HTML (and thus potentially scripts like JavaScript).

Where to use?
The string, or variations of it, can be used wherever users can send data to a server, like in forms (especially look for hidden form fields named "id" and similar), cookies, file uploads, HTTP requests etc.

What are you looking for?
Exactly what to look for depends on context but here are a few suggestions:
  • Any kind of errors like no database connection, faulty query errors, code errors etc.
  • Garbage being printed on the page
  • Any irregularities on the page (even subtle design changes like an extra line break)
  • Check HTML code for occurrences of the string
  • Check for errors in any logs
  • If used in combination with databases, check the database content
  • Check cookie content

Actual examples
I named my user this on a server application. Everything was fine until I opened the administration interface; suddenly I couldn't see anything but a few garbage characters on a white background. The reason was that the user's name was sanitized everywhere except on the admin pages.
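The "check HTML code for occurrences of the string" step is easy to script once you have the page source. A rough sketch, with a made-up function name (`reflection_status`); note the assumption that escaping looks like what Python's `html.escape` produces, which other frameworks may not match exactly (quotes in particular are encoded in several different ways).

```python
import html

COCKTAIL = "'\"$%>*<!--"


def reflection_status(page_html: str, payload: str = COCKTAIL) -> str:
    """Classify how a submitted payload shows up in a page's HTML source."""
    if payload in page_html:
        # Printed as-is: user-defined markup most likely gets through.
        return "raw"
    if (html.escape(payload) in page_html
            or html.escape(payload, quote=False) in page_html):
        # Printed, but the special characters were encoded on the way out.
        return "escaped"
    # Not echoed here, or altered in a way this sketch doesn't recognise.
    return "absent"
```

A "raw" on one page and "escaped" on all the others is exactly the admin interface situation described above.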

Another similar example was when I used it as the name of an item that showed up in a log, which blanked out everything after that log entry.

Finally I used it on a friend's project to highlight MySQL injection possibilities.

Ways to improve it
Many great suggestions to improve the string came up during the meetup. One was to add Unicode versions of certain characters, since this for instance fools some sanitization functions built into PHP as well as other sanitization plugins. Another was to add international characters, including stuff like Arabic and Japanese. Finally, Mattias suggested using an iframe instead of the comment tag at the end, since loading a scary page is even more striking than blanking out half the page (as well as actually proving a really destructive bug exists). Cred to a lot of people for all the great suggestions (especially Simon for Unicode, I'll add that for tomorrow's testing!).

Finally, notice that you'll have to change this string to fit your context; for instance, a string used on an ASP.NET application or a Java desktop application looks different.

One interesting comment that came up was automating this (which is typically what security scanners do already, by the way). First off, it's really well suited for random automation that just sends various combinations of dangerous characters into every field it finds and compares, for instance, the HTML output to a control. It might miss stuff or give a lot of false positives, but it's still a great way to work with it. It also addresses one of the weaknesses of the string: since it uses so many different commonly filtered characters, the whole string might get filtered out due to one character while a couple of the others would have worked in isolation (this is typically what I mean when saying quick tests rarely provide evidence that a certain problem can't exist). With the speed of automation (assuming you achieve that) you could test the characters one by one as well as in more combinations.
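To make the "one by one as well as in more combinations" part concrete, here is a minimal sketch. It is systematic rather than random, and everything in it is an assumption for illustration: `render` is a hypothetical hook that submits a value to the field under test and returns the resulting HTML, and "correct" output is assumed to be the control page with the value escaped the way Python's `html.escape` does it (so expect false positives on stacks that escape differently).

```python
import html
from itertools import combinations
from typing import Callable, Dict, Iterator, List

# The building blocks of the cocktail string from this post.
PARTS = ["'", '"', "$", "%>", "*", "<!--"]


def payloads(parts: List[str] = PARTS, max_size: int = 2) -> Iterator[str]:
    """Yield each part alone, then combinations up to max_size, then the full string."""
    for size in range(1, max_size + 1):
        for combo in combinations(parts, size):
            yield "".join(combo)
    if max_size < len(parts):
        yield "".join(parts)


def find_suspects(render: Callable[[str], str],
                  control: str = "bob") -> Dict[str, str]:
    """Return payload -> output for every output that deviates from the control."""
    expected = render(control)
    suspects = {}
    for payload in payloads():
        output = render(payload)
        # A well-behaved page should differ only by the escaped value itself.
        if output != expected.replace(control, html.escape(payload)):
            suspects[payload] = output
    return suspects
```

Because every part is also sent on its own, a filter that swallows the whole cocktail string because of one character no longer hides the characters that would have gotten through alone.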

So a bit of a messy post but long story short:
Instead of Bob, name your next test user '"$%>*<!-- and let it live in the lovely city of '"<!-- with interests in '"$% and >*<.

Good night!

09 April 2013

EAST meetup - talking metrics

Yesterday's EAST meetup was focused on test metrics, starting with us watching Cem Kaner's talk from CAST 2012 and then continuing with an open discussion on the webinar and metrics in general.

Cem's talk
Cem talked about the problem of setting up valid measurements for software testing, and the work he presented on threats to validity in measurements seemed very interesting (you can read more about that in the slides). He also talked about problems with qualitative measurements:

  • Credibility, why should I believe you
  • Confirmability, if someone else analyzed this, would that person reach the same conclusion
  • Representativeness, does this accurately represent what happens in the big picture
  • Transferability, would this transfer to other similar scenarios/organizations

So far so good.

But Cem also, as I interpreted it, implied that all quantitative measurements we know of are crap, but that if a manager asks for a certain number you should provide it. I agree that we testers in general need to improve our ability to present information but, as I will come back to, I strongly disagree with providing bad metrics even after presenting the risks with them.

How do you measure test in your company
Everyone was asked: How do you measure test in your company? The most common answer was happy/sad/neutral smileys in green/red/yellow, which was quite interesting since it relates closely to emotions (at least as it's presented) rather than "data".

The meaning of the faces varied though:

  • Express progress, start sad and end happy (="done")
  • Express feelings so far, first two weeks were really messy so we use a sad face even if it looks more promising now
  • Express feelings going forward, good progress so far but we just found an area that seems really messy so sad face (estimate)
In most cases some quantitative measures were used as input but weren't reported.

My personal favorite was an experiment Johan Åtting talked about, where a smiley, representing the "mood" so far (refers to item two in the list above), is put on a scale representing perceived progress (see picture). It seemed like a straightforward and nice visual way to represent both progress and general gut feeling. The measurements of progress in this case were solely qualitative, if I understood correctly, but it would work if you prefer a quantitative way of measuring progress as well.

Also a couple of interesting stories, both from the great mind of Morgan. First was an example of a bad measurement, with managers basing their bonuses on lead time for solved bug reports. This was fine as long as developers had to prioritize among incoming reports, but when they improved their speed this measurement dropped like a stone (suddenly years-old, previously down-prioritized bugs were fixed, each adding a huge lead time) and managers got upset.
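A tiny worked example shows why the number got worse exactly when the developers got better. The lead times below are completely made up (they are not Morgan's figures); the point is only how two ancient bug reports dominate an average.

```python
from statistics import mean, median

# Made-up lead times, in days, for bug reports closed in a given month.
before = [3, 5, 4, 6]            # only fresh, prioritized reports get fixed
after = [2, 3, 2, 4, 700, 900]   # faster fixes, plus two years-old reports finally closed

print(mean(before), median(before))  # 4.5 4.5
print(mean(after), median(after))    # 268.5 3.5
```

The typical fix got faster (median 4.5 to 3.5 days) and the backlog shrank, yet the average lead time that the bonuses were based on went from 4.5 to 268.5 days.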

The second story was about a company where developers and "quality/production" (incl. testers) were separated. Testers in this case tested the components sent from developers in isolation before they went to production. However, when the components got assembled tons of problems arose, problems customers reported back to the developers without quality/production knowing it. This led to a situation where quality/production couldn't understand why managers were upset about bad sales; the product was fine in their view. The situation improved when Morgan started to send the number of bugs reported by customers (a bad, quantitative measurement) to the quality/production department.

An interesting twist came later, when they tried to replace the bad quantitative measurement with something more representative and met a lot of resistance since management had learned to trust this number. I asked him if he, in retrospect, would have done anything differently but we never got to an answer.

I shared a creative test leader's way of dealing with numbers. She reported certain information (like test case progress and bug count) but removed the actual numbers. So when presenting to upper management she simply had visual graphs to demonstrate what she was talking about. As far as I know this was well received by all parties.

Finally an interesting comment from Per saying: "Often I find it more useful to say; give us x hours and I can report our perceived status of the product after that".

Epiphanies (or rather interesting insights)
During the discussions I had a bunch of interesting insights.
  • Measurements are not a solution to trust issues!
  • Instead of saying "we have terrible speech quality" or "the product quality is simply too bad" we could let the receiver listen to the speech quality or demo the aspects of the product we find bad. It's a very hands on way to transfer information we've found.
  • Ethics is a huge issue. If we want someone to believe something we can (almost) always find compelling numbers (or analysis for that matter).
  • Measuring progress or making estimations will not change the quality of test/product's state after a set amount of time (just as a reminder of what you don't achieve and the cost of measurements).
  • If a "bad measurement" can help us reach a better state in general (like in Morgan's example), is it really a bad measurement? (touching on ethics again).
  • When adding quantitative measurements to strengthen our qualitative measurements, are we digging our own grave? (risk of communicating: So it's not my analysis/gut feeling that matters, it's the numbers)
  • How a metric is presented is often more important than the metric itself.
  • In some of the cases brought up, the measurements weren't even missed when they were no longer reported.
  • Don't ask what reports or data someone wants, ask what questions they want you to answer.
  • Cem talked about transfer problem for students, making it hard for them to understand how a social science study can relate to computer science for instance. I think the same problem occurs when we move testing results into a world steered by economics (numbers).
  • Even bad measurements might be useful to highlight underlying problems. Once again Morgan's examples somewhat show this, and Per talked about how more warnings in their static code analysis were an indication that programmers might be too stressed (if I interpreted it correctly). In these cases it's important to state what the measurement is for and that it's just a potential indication.

Measurement tampering
We talked about how we easily get into bad habits/behavior when quantitative measurements become "the way" to determine test status.

Bad behavior when measuring test case progress:
  • Testers saving easy tests so they have something to execute when the progress is questioned
  • Testers running all simple tests first to avoid being questioned early on
  • Testers not reporting the actual progress, either to avoid putting too much pressure on themselves or to fake progress to calm people down.
  • Testers writing tons of small simple tests that typically test functionality individually which creates a lot of administrative overhead as well as risk of not testing how components work together, this to ensure a steady test case progress.
  • Test leaders/managers questioning testers when more test cases are added (screws up "progress"); as a result, testers avoid broadening the test scope even when it's obviously flawed.
Bad behavior when measuring pass/fail ratio:
  • Ignoring bugs occurring during setup / tear down.
  • Slightly modifying a test case to make it pass (making it conform with actual result rather than checking if that's a correct behavior).
  • Slightly modifying a test case to make it pass (removing a failing check or "irrelevant part that fails")
We also talked about how the culture (not only related to test measurements) in various countries affects testing. One question was: Why is CDT so popular in Sweden? Among the answers were low power distance (Geert Hofstede, cred to Johan Åtting), the Law of Jante and that we're not measured that much in general (i.e. late grading in schools, not grading in numbers etc.).

We also talked about drawbacks with these cultural behaviors (like it being hard to reach decisions since everyone should get involved/agree).

Finally, and mostly, we talked about how our view on testing and measuring sometimes collides with other cultures, with actual examples from India, Poland and the US. This discussion was a bit fragmented but feel free to grab me if you want to hear about it.

This was a great EAST meetup and I really feel sorry for the guys who I know like this topic but couldn't attend. Definitely a topic I hope (and think) we'll get back to!

Finally a lovely quote from Magnus regarding a private investigator determining time to solve a crime:
"Let's see, there are 7 drops of blood, this will take 3 weeks to solve".

Good night! .)