Can remote tree testing predict moderated results? A fascinating study by a UX expert

This article is published with permission from the author, and was originally published on neoinsight.com. When choosing online research tools, it helps to have confidence that the data produced by the tool will do a reasonable job of predicting what will be found using moderated usability testing, the gold standard of usability research. Recently, we were able to run an online tree-testing study for a site design, then run moderated usability sessions with people on a demo web site with an almost identical design. This article shares some of our results, which show how closely the online test results predicted moderated results for participants on the live demo site. We used Treejack for this study.
Our aims for the study
The live site we studied is a single-purpose ‘service finder’ site, with little navigation other than selecting topics and subtopics that help people produce a set of services tailored to their needs. Over time, the web team had seen task success rates decline as the number and type of service topics grew out of control. The goal of this project was to create a new topic navigation design that brought success rates back up to levels established in previous usability tests. It’s rare to get a chance to test an almost identical design between tree-testing and a live site, because usually we make substantial improvements to the site, before development, based on what we learn in the online test phase. In this case though, we’d already been through several iterations of Treejack studies before finally hitting success rates that were worth the development expense.
High correlation for a simple visual design
In the chart below, you can see the success rates by task for the online Treejack study, where each task had 40-50 participants. Results for the remote moderated usability test of the same design show tasks across a set of 16 participants using a working demo of the site. The correlation coefficient across the 11 tasks for the two studies is .57. A few things to notice:
- Treejack predicted which tasks people would find most difficult
- Treejack also predicted which tasks people would do well on in the moderated testing
- Treejack success rates are lower than the moderated testing rates — on average, the success rates in the live usability test are 29% higher than the success rates in the Treejack study.
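For readers who want to run the same kind of comparison on their own studies, here is a minimal sketch in Python of the two figures discussed above: the correlation across tasks and the average gap between the two methods. The function names and the success rates in the example are hypothetical and are not the data from this study. The gap is computed in percentage points, which is an assumption; the article does not say whether its 29% figure is a point difference or a relative one.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


def mean_gap(tree_rates, moderated_rates):
    """Average of (moderated - tree test) per task, in percentage points."""
    if len(tree_rates) != len(moderated_rates) or not tree_rates:
        raise ValueError("need two non-empty lists of the same length")
    gaps = [m - t for t, m in zip(tree_rates, moderated_rates)]
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    # Hypothetical per-task success rates (percent), for illustration only.
    tree_test = [30, 45, 60, 75, 90]
    moderated = [55, 80, 85, 95, 100]
    print(f"correlation across tasks: {pearson(tree_test, moderated):.2f}")
    print(f"average gap: {mean_gap(tree_test, moderated):.1f} points")
```

Each list holds one success rate per task, in the same task order for both studies. With only a handful of tasks, a single unusual task can move the coefficient noticeably, so it is worth looking at the per-task pairs as well as the summary number.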
Why are some tree test success rates so much lower?
Good visual design helps users understand the purpose of what they are seeing.
First off, Treejack success scores are not always that much lower. Some tasks not shown above had very high success rates in the Treejack test, in the 90% range. We were so confident that they would be successful that they weren’t included in the moderated testing. The visual design is the main factor behind the difference in the scores. On this site, the visual design helped people see more of the structure and purpose of the topics than the Treejack participants saw. The design was very clean, with visual cues like indenting and lists, and no visual clutter obscuring the top tasks. The team had carefully selected one or two sub-topics to display as exemplars beneath each main topic. This helped moderated participants take a short-cut to the right sub-topic, and seeing the sub-topics also helped them better understand the intent of the topic itself. For example, in a Treejack study, you would only see top-level topics like this:
But on a website, the participant might see one or two popular sub-topic links below the topics or perhaps you could hover over a topic to see it open out, without clicking, as shown below:
Treejack, therefore, is brutally strict compared to a good page design. And that’s ok, since getting those top-level categories right is crucial to task success, particularly in mobile designs, where hover isn’t available. The point is that Treejack helps create designs with clear, well-defined categories, where people are more likely to get that important first click correct, and are therefore twice as likely to succeed, as shown in Bob Bailey’s First Click Usability Testing study.
A caveat — sometimes tree testing scores are higher
In a recent study of a different web site, a preliminary Treejack study was not as predictive of success with moderated participants. The Treejack participants selected a particular footer topic far more often than participants in moderated testing, skewing the Treejack results towards low success rates for many tasks where that topic was incorrect, and conversely, towards a higher success rate where it was correct. In the Treejack view, participants could see the entire structure at a glance, including the footer items at the bottom of the list. Including footer-type options in a Treejack study is a decision that should be made on a case-by-case basis; in this case we wanted to explore their influence. In contrast, the moderated participants saw a visual design of the page that separated the footer from the main navigation by several rows of tiled images. Participants usually didn’t scroll down past the first row of tiled images (as seen in many other studies, they thought they were ads) to see the footer item that had so dominated the Treejack results. As a result, they were more successful at many of the tasks. For tasks involving other footer topics, they were less successful. Using both Treejack and moderated testing methods in this set of studies highlighted the influences of both the language and the colors, the placement and the imagery, on the site design.
“Design is not just what it looks like and feels like. Design is how it works.” – Steve Jobs
Tree testing – part of a balanced UX design process
Treejack testing is a fundamental part of a navigation design process but it’s not a replacement for moderated testing except in cases where:
- the visual design is supportive of and very consistent with the Treejack presentation
- the design change is solely to the navigation topics on a site that is well-understood through previous moderated testing
- the entire purpose of the site or application is to produce results based on the topic hierarchy (as is the case for the simple Finder site described in the first study).
Online iterative tree-testing is a useful research technique for any team as they progress towards a full design. In Lean/Agile processes, a study can go online within a few days, to generate results that guide the team towards a supportive visual design. User Experience researchers and design teams can feel confident that, keeping the visual design caveats in mind, Treejack tests will predict navigation success rates of tasks on the live site. And that’s pretty priceless.
Some excellent resources to take you deeper
- Blending moderated and online Treejack usability testing: Move people forward – new research techniques to improve navigation (Article)
- Usability testing as part of an ongoing process: Improving Cisco Support Website (Webinar)
- Using Treejack on a large government website: Auditing the IRS with Treejack (Article)
- Treejack won’t predict the impact of visual design, and that’s the beauty of it (Article)
- Using two techniques: Why card sorting loves tree testing (Article)