TV Interface testing with Chalkmark

Guest Writer: Gianna Lapin

Testing websites and intranets is a common practice when using the Optimal Workshop tools. But as you may know, we can apply these tools to places outside desktop computers, tablets and phones. To show you how we can expand the testing canvas into other mediums, I decided to use Chalkmark (a first-click testing tool) to compare the TV interfaces of three major Over-The-Top (OTT) video content providers: Netflix, VUDU and Hulu Plus.

Netflix and Hulu have enjoyed tremendous user adoption in the last few years, helped by recent increases in consumer bandwidth and the availability of set-top boxes like Roku and Apple TV, as well as smart televisions. At the close of 2012, over 32 million US households subscribed to at least one OTT service.

For users who access content through their TV (instead of on a mobile device, for example), these interfaces can be a challenge to operate. One important component is the functionality and form factor of the remote control, which strongly affects the experience of finding and watching movies. Instead of a highly responsive mouse and cursor, which a user can move around the screen at will, the remote control requires users to navigate to one hotspot at a time by pressing arrow keys and an “OK” or “Select” button. The overall experience is very much like the old text-based browsers of the early 1990s.
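To make the one-hotspot-at-a-time model concrete, here is a minimal Python sketch of focus moving across a grid of hotspots. It assumes a simple rectangular grid and four arrow keys; the class, function, and key names are hypothetical and are not taken from any of the three services.

```python
from dataclasses import dataclass

# Hypothetical key names for a simple remote; real remotes vary.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


@dataclass
class FocusGrid:
    """A grid of hotspots with a single focused cell, as on a TV interface."""
    rows: int
    cols: int
    row: int = 0
    col: int = 0

    def press(self, key: str) -> None:
        """Move focus one hotspot in the given direction, stopping at the edges."""
        d_row, d_col = MOVES[key]
        self.row = min(max(self.row + d_row, 0), self.rows - 1)
        self.col = min(max(self.col + d_col, 0), self.cols - 1)


def presses_needed(start: tuple[int, int], target: tuple[int, int]) -> int:
    """Arrow presses to reach the target, plus one press of OK to select it."""
    return abs(target[0] - start[0]) + abs(target[1] - start[1]) + 1


grid = FocusGrid(rows=3, cols=5)
for key in ["right", "right", "down"]:
    grid.press(key)
print((grid.row, grid.col))            # (1, 2)
print(presses_needed((0, 0), (1, 2)))  # 4
```

A mouse user would reach the same hotspot with one movement and one click, which is why the distance between hotspots matters so much more on a TV.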

Both Hulu Plus and Netflix charge $7.99 USD a month to access their respective libraries of video content, essentially offering an “all you can eat” experience to subscribers. VUDU’s model is more “a la carte,” operating much like a video rental business where you pay a small fee to watch a title within a specific timeframe (you have 30 days to start watching, and once you start you must finish within 24 hours). VUDU also offers redemption of UltraViolet digital codes, available with many packaged DVD sets in stores; this provides the user with permanent ownership of a digital copy of the DVD, stored in VUDU’s cloud.

Study Design

To compare apples to apples as much as possible, I focused my study on just two key screens for each service: the high-level introductory screen that appears immediately after the viewer launches the application (the “Browse” screen), and the screen that appears when a user selects a specific title (the “Details” screen). Again for consistency’s sake, the screens are shown as they appear on the Roku 3 set-top box.

Since the screenshots I gathered from the web varied so much, I decided to render each screen as a low-detail wireframe. This takes advantage of Chalkmark’s key strength – you can test any kind of image, no matter how low-fidelity. Since Chalkmark isn’t truly interactive, it’s extremely simple to sketch out an idea and test it without having to worry about how the interface responds to the user. As I worked up the wireframe for each service, I also removed obvious branding to anonymize the services as much as possible.
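Because the test image is static, a first click is conceptually just a coordinate that can be checked against regions of the wireframe. The sketch below illustrates that idea in Python; it is not Chalkmark's implementation, and the region names and pixel values are made up for illustration.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    """A rectangular hotspot on the wireframe, in image pixels."""
    name: str
    left: int
    top: int
    width: int
    height: int

    def contains(self, x: int, y: int) -> bool:
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def region_for_click(regions: list[Region], x: int, y: int) -> Optional[str]:
    """Return the name of the first region containing the click, if any."""
    for region in regions:
        if region.contains(x, y):
            return region.name
    return None


# Made-up layout for illustration; not measured from any real screen.
regions = [
    Region("featured_banner", left=0, top=0, width=1280, height=300),
    Region("tv_menu_item", left=40, top=320, width=200, height=60),
]

print(region_for_click(regions, 100, 350))   # tv_menu_item
print(region_for_click(regions, 640, 150))   # featured_banner
print(region_for_click(regions, 1000, 700))  # None
```

Nothing in this check depends on how polished the image is, which is why a rough wireframe can be tested just as easily as a finished screen.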

Hulu Plus Browse

Hulu Plus Details

Netflix Browse

Netflix Details

VUDU Browse

VUDU Details

Three different surveys were created, one for each OTT service, and respondents were recruited using Mechanical Turk. Each survey collected approximately thirty participants. The same twelve questions were asked in random order for each participant, and responders weren’t allowed to skip questions.

Demographic questions

Like the other Optimal Workshop products, Chalkmark gives researchers the ability to ask pre- or post-survey questions. I wanted to find out what kind of experience (if any) responders had with OTT services, as well as with set-top boxes and Smart TVs. I reasoned that experience with these products might influence the participants’ answers, so by asking these questions I would be able to filter the results and compare experienced participants against inexperienced participants.

We can see that Netflix is the most popular OTT service, followed by Amazon Prime and then by Hulu Plus. I was surprised that nobody said they had experience with VUDU. However, a whitepaper by Cisco says that there are more than twenty different OTT content providers available to consumers in the USA, so the market is quite fractured.


Right away we can compare the time taken for each test. Chalkmark tests should be very quick for responders to complete, since it is recording a user’s first reaction. We can see here that there is not a significant difference between the three tests.

Hulu Plus:



Click here…

Users really rely on feedback to their actions, and since that is notably absent when simply viewing an interface, it’s no surprise that about a third of users acted as though this interface should behave like a web page, with key elements hyperlinked to more content. The results for this question indicate that participants thought they could click on the title of the show that was visible on the screen, probably expecting to launch a video player. This was true for all three OTT services. Hulu’s interface fared much worse than the other subscription-based service, Netflix, probably because users didn’t understand that the “Body of Proof” image was simply part of Hulu’s rotating carousel advertising featured titles. You actually have to select “TV” in order to find “Body of Proof” episodes to watch, a fact that escaped over 70% of respondents.

Figure 1. Results for the task “You want to watch [selected title] right away.”

Netflix Browse Screen

Hulu Plus Browse Screen

VUDU Details screen

When in doubt, go Home

Participants tended to use the Home button on the remote control whenever there was doubt about what to do next. This kind of behavior can be compared to clicking “Home” or the logo on a web page, believing that it will take you back to the topmost page on a website. Jared Spool, founder of UIE, claims that hitting the Back or Home button on a website indicates task failure, and I’m inclined to apply the same logic to these interfaces. A notable exception is VUDU, where a button called “Similar” sounded like a good choice for nearly half of responders.

Figure 2. Results for the task “You would like to find a different TV series to watch.”

Netflix Details Screen

Hulu Plus Details Screen

VUDU Details Screen

Variable success when playing movies

One of the most important tasks that a user of an OTT service must be able to perform is to actually watch a video. To test this, I asked essentially the same question twice, and showed the participants a different screen for each one. Responders struggled to decide on an action when faced with the Browse screen; this is seen especially with Hulu, whose carousel again confounded over 70% of participants. Once they saw the Details screen, people generally figured out what to do, with the exception of VUDU. Responders didn’t quite understand that they needed to pay money to rent or own the title, and as a result responses for the task “You want to watch [selected title] right away” were all over the place.

Figure 3. Results for the task “You want to watch [selected title] right away”. Browse screens vs. Details screens.

Netflix Browse Screen

Netflix Details Screen

VUDU Browse Screen

VUDU Details Screen


Hulu Plus Browse Screen

Hulu Plus Details Screen

Searching isn’t easy

All three services offered a selection of recommended titles – usually based on past viewing history – but if a viewer was in the mood for a specific kind of movie, most likely he or she would need to navigate through several screens in order to find a place to search. VUDU again stands out as the exception; it offers a filter control right on the Browse screen, which is intuitive and easy for the participants to recognize.

Figure 4. Results for the task “You would like to watch an Action movie.”

Netflix Browse Screen

Hulu Plus Browse Screen

VUDU Browse Screen

Prior experience doesn’t guarantee proficiency

As I mentioned earlier, I asked participants which OTT services they had experience with. I looked at the results from participants who claimed experience with Netflix and Hulu Plus (VUDU was excluded, as nobody said they were familiar with it).

Participants with previous Netflix experience actually performed worse: they assumed they could simply click the actor’s name in the program description at a higher rate than the participant group as a whole. By contrast, responders familiar with Hulu Plus were more likely to believe they had to search to find the answer.
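The comparison described here, a filtered subgroup against the whole group, can be sketched in a few lines of Python. This is only an illustration: the record layout, field names, and values below are invented and are not the study's data or Chalkmark's export format.

```python
from collections import Counter

# Hypothetical export: one record per participant for a single task.
# Field names and values are invented for illustration, not real study data.
responses = [
    {"uses_netflix": True,  "first_click": "actor_name"},
    {"uses_netflix": True,  "first_click": "actor_name"},
    {"uses_netflix": True,  "first_click": "search"},
    {"uses_netflix": False, "first_click": "search"},
    {"uses_netflix": False, "first_click": "home"},
]


def click_shares(records):
    """Return each click target's share of first clicks, as a fraction of records."""
    counts = Counter(r["first_click"] for r in records)
    total = len(records)
    return {target: n / total for target, n in counts.items()}


# Keep only participants who reported prior experience with the service.
experienced = [r for r in responses if r["uses_netflix"]]

whole = click_shares(responses)
subgroup = click_shares(experienced)

for target in sorted(whole):
    print(f"{target}: all={whole[target]:.0%}, experienced={subgroup.get(target, 0):.0%}")
```

With only about thirty participants per survey, a subgroup can be very small, so differences like these are better read as hints than as firm conclusions.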

Figure 5. Results for the task “You want to find other shows that [name] acted in.”

Netflix Browse Screen (with prior experience)

Hulu Plus Browse Screen (with prior experience)

Hulu Plus Browse Screen (entire group)

Netflix Browse Screen (entire group)

Cheers Chalkmark

As we can see, Chalkmark can generate tons of useful insight into how people interact with interfaces, even ones that aren’t web-based. The ability to filter participants according to how they responded to pre- and post-survey questions is extremely valuable, allowing researchers to study results that match specific criteria. Since Chalkmark works with flat, static images, anything can be an interface, from wireframes to screenshots to photographs – so the time you would have spent building a functional prototype can be used to analyze results and make the next round of enhancements. Chalkmark, and the other Optimal Workshop products, truly lay to rest the old claim that usability testing takes too long and is too expensive.


Gianna Lapin

Gianna is the senior UI/UX designer for a leading medical institution’s intranet team. She leads large design projects for clinical and operational departments, and helps author and implement enterprise-wide standards for web-based communication. She also designs and conducts user research studies and evangelizes for the human side of human-computer interaction.

Twitter: @giannalapin
