Conducting Comparative Usability Tests

Macadamian Technologies | May 4, 2016 | 5 Min Read


Comparative usability testing provides product management, research, design, and development teams with a wealth of data and insights into how a product stacks up against its competitors. Since baseline performance metrics and comparative data are used to claim product success, it is important to collect and analyze the data accurately.

But the additional variables that must be controlled for in comparative testing generally make it more challenging than traditional usability testing.

These variables span planning, execution, and test analysis, adding complexity to the following activities:

  • Defining and recruiting test participants
  • Training user experience researchers
  • Creating the test environments
  • Training usability test participants
  • Executing test sessions
  • Analyzing test data

The Planning Stage

In traditional single-product usability testing, recruiting the right participants efficiently can be a challenge. For a comparative usability test that focuses on evaluating two or more products, participant selection can be even more difficult because there are additional variables to consider, including prior experience, brand and product attitude, and domain skills and frequency of using those skills. It is often useful to balance these variables across participants.

The researchers who run the sessions should be trained with the insights of experienced end users of the products under evaluation so that they understand typical workflows, features, and terminology. This helps them recognize when a test participant strays from the ideal path or makes an irreversible error while trying to complete a task.

For in-person usability tests that involve evaluating products from multiple vendors, a neutral test location can help participants feel more comfortable providing honest opinions and feedback. Tests can be long and tiring, so consider including breaks with refreshments between tests to keep participants fresh and alert.

Clients or other stakeholders can gain useful insights by observing the usability test sessions, but ensure that participants remain unaware of any direct connection that observers may have to any of the products under evaluation.

The Execution Stage

During the execution stage, several decisions need to be made about participant training:

  • the purpose of the training,
  • what skills and knowledge the participants should learn, and
  • how the training will be conducted.

Training should be formal, structured, and provided consistently to all participants, helping to ensure that every test participant begins their usability test session with the same skills and exposure to the systems or products being tested.

Executing a successful test requires careful selection of the product order, types of test tasks, task phrasing, lab space, wrap-up content, and usability metrics.

  • Product order can introduce practice or learning effects as users get “warmed up,” as well as fatigue effects as the session wears on.
  • Careful task phrasing gives every participant the same message, a clear goal, and the same baseline understanding of each task. Participants can take instructions very literally, so it is best to avoid slang and product-specific language.
  • As in traditional usability tests, the space should provide a realistic environment and there should be ample time to conduct a meaningful test wrap-up session. For comparative usability tests, it is also important to keep the environment as consistent as possible from session to session and from product to product so that the environment does not skew the test data.
  • For a comparative test, the collection of quantitative data is strongly recommended, as it allows direct product comparisons to be made and statistical significance to be calculated. Well-presented quantitative results can be very meaningful, easy for stakeholders to understand, and straightforward to market and promote. For example, completion rates are easy to collect and provide a simple metric of success and system effectiveness.
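
One common way to keep product order from skewing results is to counterbalance it, so that each possible order is used about equally often across participants. The sketch below is only an illustration of that idea; the function name, product labels, and participant count are hypothetical and do not come from a specific study.

```python
from itertools import permutations


def counterbalanced_orders(products, n_participants):
    """Assign each participant a product order, cycling through every
    possible order so that all orders are used about equally often."""
    all_orders = list(permutations(products))
    return [all_orders[i % len(all_orders)] for i in range(n_participants)]


# Hypothetical example: two products, six participants.
orders = counterbalanced_orders(["Product A", "Product B"], 6)
for participant, order in enumerate(orders, start=1):
    print(f"P{participant}: {' -> '.join(order)}")
```

With two products, half of the participants start with each product. With three or more products the number of possible orders grows quickly, so the participant count should ideally be a multiple of the number of orders.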

The Test Analysis Stage

The type of analysis performed on comparative usability test results depends on the data collected and who was involved in the test. The same participants can attempt similar tasks on all products (within-subjects testing), or a different set of users can evaluate each product (between-subjects testing). These different methodologies influence the calculations made to determine if any differences are statistically significant.

Within-Subjects Testing

Since the same group of participants evaluates all products in a within-subjects test, a paired t-test can be used to assess the results. This analysis uses the mean and standard deviation of the differences between each participant's paired data points rather than the raw scores. This type of test is more efficient because participants act as their own controls, and the difference scores remove much of the participant-to-participant variability present in the raw scores. Another advantage is statistical power: every participant contributes data for every product, so the design can detect a given difference with fewer participants than a between-subjects analysis.
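
As a rough illustration of the paired t-test described above, the following sketch computes the t statistic from the per-participant differences using only the Python standard library. The function name and the task times are hypothetical, and a real analysis would typically also look up a p-value in a statistics package.

```python
import math
import statistics


def paired_t(scores_a, scores_b):
    """Return (t statistic, degrees of freedom) for a paired t-test.

    scores_a[i] and scores_b[i] must come from the same participant.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("Each participant needs a score for both products")
    # The test works on the differences, not on the raw scores.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical task times in seconds for five participants.
times_product_a = [42, 55, 38, 61, 47]
times_product_b = [50, 58, 45, 66, 49]
t, df = paired_t(times_product_a, times_product_b)
print(f"t = {t:.2f}, df = {df}")
```

A negative t here would indicate that the hypothetical participants were faster on Product A than on Product B.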

Between-Subjects Testing

Since independent groups of participants each evaluate one product in a between-subjects test, the analysis can be considerably more complex. Depending on the study design, it may call for independent-measures t-tests (typically when two products are compared) or an analysis of variance (typically for three or more). This type of study often requires a large number of participants to generate useful, analyzable data. Other issues include unequal sample sizes and the need to add a new group for every treatment or manipulation in the study. It is also impossible to guarantee homogeneity across the groups, because each group is made up of different individuals, and even subtle differences between them can skew the data.
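
For two independent groups, one possible form of the independent-measures t-test is Welch's variant, chosen here as an assumption because it tolerates the unequal sample sizes and variances mentioned above. The sketch uses only the Python standard library; the function name and scores are hypothetical.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's
    t-test on two independent groups of scores."""
    n_a, n_b = len(group_a), len(group_b)
    # Each group's sample variance scaled by its own size.
    v_a = statistics.variance(group_a) / n_a
    v_b = statistics.variance(group_b) / n_b
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(v_a + v_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v_a + v_b) ** 2 / (v_a**2 / (n_a - 1) + v_b**2 / (n_b - 1))
    return t, df


# Hypothetical satisfaction scores from two different groups of users.
group_product_a = [1, 2, 3, 4, 5]
group_product_b = [2, 4, 6, 8, 10, 12]
t, df = welch_t(group_product_a, group_product_b)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Comparing three or more products at once would call for an analysis of variance instead, which is better left to a statistics package.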

When calculating test results, include confidence intervals around any means, compare results to specific benchmarks or goals, and compare results to the other products to determine if a significant difference exists.
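
A completion rate is a mean of pass/fail outcomes, so it can be reported with a confidence interval and checked against a goal. The sketch below uses the Wilson score interval, which is one reasonable choice for the small samples typical of usability tests; the function name, the counts, and the 70% goal are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Return the (low, high) Wilson score interval for a completion rate."""
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical result: 8 of 10 participants completed the task.
low, high = wilson_interval(8, 10)
goal = 0.70  # hypothetical benchmark completion rate
print(f"Completion rate 80%, 95% CI: {low:.0%} to {high:.0%}")
print("Goal clearly exceeded" if low > goal else "Cannot claim the goal is exceeded")
```

Comparing the lower bound of the interval with the goal is a conservative check: the claim is made only when the whole interval sits above the benchmark.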

Summing Up

As you can see, comparative testing introduces new variables that must be anticipated and controlled for. Taking the time to develop a comprehensive methodology that accounts for these complexities, executing it carefully to get the right data, and performing the appropriate analytics are keys to successful product analysis.
