Monday, May 24, 2010

Acceptance Testing and the Testing Pyramid

For the past couple of months, I've been working with a client who is seeking to get the best value they can out of their testing automation efforts. One of the big opportunities I've seen for them to increase the value of their tests is to adhere to the "Testing Pyramid" -- the idea that you should have lots of unit tests, fewer integration tests, even fewer functional tests, and very few acceptance tests at the top of the pyramid.
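To make the base of that pyramid concrete, here's a minimal sketch of the kind of test that should exist in the greatest numbers: a plain JUnit unit test that runs entirely in memory. The PriceCalculator class and its discount rule are hypothetical, invented purely for illustration.

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class PriceCalculatorTest {

        // Hypothetical class under test, inlined so the sketch is self-contained.
        static class PriceCalculator {
            double totalFor(double subtotal) {
                return subtotal > 100.00 ? subtotal * 0.90 : subtotal;  // 10% discount over $100
            }
        }

        @Test
        public void appliesTenPercentDiscountOverOneHundredDollars() {
            // Runs entirely in memory: no browser, no HTTP server, no database.
            assertEquals(108.00, new PriceCalculator().totalFor(120.00), 0.001);
        }
    }

Tests at this level run in milliseconds, which is exactly why they can make up the bulk of the pyramid.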

But in the process of working this out with them, I've been noticing that how people define acceptance tests varies widely. And then I came across this interesting review of an article by Jim Shore on whether to automate acceptance tests at all:

http://www.infoq.com/news/2010/04/dont-automate-acceptance-tests

(the original article is here: http://jamesshore.com/Blog/The-Problems-With-Acceptance-Testing.html )

Having read the original article, I don't think the reviewer quite fairly represents Shore's position. Shore is not so much arguing against automating acceptance tests, as arguing that automating acceptance tests, by itself, doesn't buy the business value that he once thought it did.

Me? I think that the key value lies in defining "acceptance tests" as literally that--the automated reification of the Product Owner's (PO's) Acceptance Criteria.

As any Agile developer knows, the five-to-seven Acceptance Criteria on a particular story represent only a fraction of the expected functionality--but they represent, ideally at least, the core business value and core expectations of the PO. By automating those expectations in a "visible execution" testing tool like Selenium, we gain two benefits at once: we secure those core expectations against regressions, and we create a built-in "known working demo" for the customer of the functionality the customer most desires.
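As a concrete illustration, here's a minimal sketch of what such a test might look like using Selenium's WebDriver API with JUnit. The application URL, element ids, and confirmation text are hypothetical placeholders; the point is that the test reads as a direct restatement of an acceptance criterion and drives a real, visible browser.

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;
    import org.openqa.selenium.By;
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.firefox.FirefoxDriver;

    public class CheckoutAcceptanceTest {

        // Acceptance criterion (hypothetical): "After checking out, the customer
        // sees a confirmation that the order has been placed."
        @Test
        public void customerSeesConfirmationAfterCheckout() {
            WebDriver driver = new FirefoxDriver();  // a real, visible browser
            try {
                driver.get("http://localhost:8080/shop/cart");         // hypothetical URL
                driver.findElement(By.id("checkout")).click();         // hypothetical ids
                driver.findElement(By.id("confirm-order")).click();

                assertEquals("Your order has been placed",
                             driver.findElement(By.id("confirmation-message")).getText());
            } finally {
                driver.quit();
            }
        }
    }

A test like this is slow and somewhat brittle, but it is something you can run in front of the PO: the browser visibly does what the criterion says it should.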

But what about the rest of the functionality, the part that the delivery team fills in, that isn't explicitly mandated by the acceptance criteria? That's where you can (and should) drop down a level, into functional testing--headless browsers like HtmlUnit for web applications, and behind-the-UI testing tools like FIT for desktop applications. These tests are less brittle than automated acceptance tests and considerably faster-running, but they pay for those advantages by being opaque to the PO. That's fine, though: we've already written the PO's core expectations in a visible, user-comprehensible form.
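Here's a comparable sketch at the functional-testing level, using HtmlUnit's headless WebClient. Again, the URL, form and field names, and the validation message are hypothetical; what matters is that the test exercises behavior the acceptance criteria never spell out, without the cost of driving a real browser.

    import org.junit.Test;
    import static org.junit.Assert.assertTrue;
    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.html.HtmlForm;
    import com.gargoylesoftware.htmlunit.html.HtmlPage;
    import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;

    public class SearchFunctionalTest {

        // An edge case the acceptance criteria never mention: submitting an
        // empty search should produce a friendly validation message.
        @Test
        public void emptySearchShowsValidationMessage() throws Exception {
            WebClient webClient = new WebClient();  // headless -- no browser window
            try {
                HtmlPage searchPage = webClient.getPage("http://localhost:8080/shop/search"); // hypothetical URL
                HtmlForm form = searchPage.getFormByName("search");   // hypothetical form name
                HtmlSubmitInput submit = form.getInputByName("go");   // hypothetical field name

                HtmlPage resultPage = submit.click();
                assertTrue(resultPage.asText().contains("Please enter a search term"));
            } finally {
                webClient.closeAllWindows();
            }
        }
    }

The PO is unlikely to read a test like this, and that's fine--it runs in a fraction of the time and covers the details the delivery team filled in.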

The problem comes when people take "acceptance tests" to mean "system tests done through the UI", and then attempt to test their entire application via this type of test. I think that's what Shore is getting at when he says "plus a full set of business-facing TDD tests derived from the example-heavy design". I've worked on projects like this, where the legacy technology we were using (ColdFusion) gave us literally no entry points between the web page and the database. Yes, we were able to build a good application this way--but by the time we were done, the entire test suite took over seven hours to run, so during each Sprint we executed only the roughly 10% of the suite we judged "most relevant", saving the full run for the end of the Sprint. That's not really a recipe for good iterative development or TDD.

As long as "acceptance tests" are kept at the level (and scope) of the acceptance criteria, they become a powerful tool both for communicating with the PO and for securing the PO's core business value against regressions. Testers and developers are then free to use all the fast, efficient, lower-level tests they want to support test-driven design and provide a refactoring safety net.