
Pay Attention to Reliability and Validity in UXR

Mani Pande
UXR-manipande
Dec 10, 2023


When was the last time you checked whether a scale you designed for a survey was reliable and valid? In the hustle of delivering results fast, UXRs often overlook these important concepts. Far from being mere academic jargon, they are vital tools in our UX toolkit, especially when UXRs run surveys. Let’s delve into reliability and three key types of validity: predictive, content, and construct. I will illustrate them with practical examples from the UXR world.

Reliability

Reliability in UX research is about consistency. If we conduct a survey or a test multiple times under similar conditions, we should expect to get consistent results. For example, if we’re testing the efficiency of a new search feature on a website, reliability would mean that repeated tests with comparable users produce similar task times and success rates, whether those numbers are good or bad.

Validity

While reliability is about consistency, validity is about accuracy. It’s about ensuring that your test is actually measuring what it’s supposed to measure. For example, when assessing the usability of a new app feature, a valid measure would capture usability across the dimensions that matter, such as ease of use, usefulness, complexity, and consistency.

The good news is that if you want to measure usability, there are industry-standard scales, such as the System Usability Scale (SUS) and the Usability Metric for User Experience (UMUX), that have been found to be reliable and valid. I am a big believer in using scales that have been tried and tested rather than creating one from scratch, which avoids the common pitfalls associated with a lack of reliability and validity.

The most basic types of validity are:

1. Predictive Validity

Predictive validity involves how well a test predicts future behavior or outcomes. In UX, this might look like using beta testing feedback to predict how well a new feature will be received by the wider user base.
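
As a rough illustration of the beta-testing idea, the sketch below compares later adoption between beta testers who rated a feature highly and those who did not. The data, the tuple layout, and the cutoff of 4 on a 5-point scale are all hypothetical; a real check would need a larger sample and a significance test.

```python
# Hypothetical beta data: (beta rating on a 1-5 scale, adopted the feature after launch?)
beta_results = [
    (5, True), (4, True), (5, True), (4, False),
    (2, False), (3, False), (1, False), (3, True),
]

HIGH_RATING = 4  # assumed cutoff for a "high" beta rating


def adoption_rate(rows):
    """Share of users in `rows` who adopted the feature after launch."""
    return sum(adopted for _, adopted in rows) / len(rows)


high = [row for row in beta_results if row[0] >= HIGH_RATING]
low = [row for row in beta_results if row[0] < HIGH_RATING]

high_rate = adoption_rate(high)
low_rate = adoption_rate(low)

print(f"Adoption among high raters: {high_rate:.0%}")
print(f"Adoption among low raters:  {low_rate:.0%}")
```

A clear gap between the two groups would suggest that the beta rating carries some predictive validity for adoption; no gap would suggest it does not.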

2. Content Validity

Content validity is about whether a test comprehensively covers the concept it’s supposed to measure. Imagine creating a quiz to assess digital literacy. If the quiz only covers social media use but ignores other aspects like email, web browsing, and cybersecurity, then it lacks content validity. It’s not fully capturing the breadth of digital literacy.
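
One lightweight way to spot a content validity gap like this is to tag each question with the facet it assesses and compare against the facets you intended to cover. The sketch below does that for the digital literacy example; the question IDs and tags are hypothetical, and deciding which facets belong in the list is still a judgment call for the researcher.

```python
# Facets of digital literacy the quiz is meant to cover (from the example above).
required_facets = {"social media", "email", "web browsing", "cybersecurity"}

# Hypothetical quiz: each question is tagged with the facet it assesses.
quiz_items = {
    "Q1": "social media",
    "Q2": "social media",
    "Q3": "social media",
}

covered = set(quiz_items.values())
missing = required_facets - covered  # facets with no question at all

print("Covered facets:", sorted(covered))
print("Missing facets:", sorted(missing))
```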

3. Construct Validity

Construct validity is about whether a measure behaves the way theory says the underlying concept should, that is, whether it relates to other measures in the way the theory predicts. This is the one that UXRs should pay close attention to. Let’s go back to SUS as an overall measure of usability. If the theory suggests that higher ease of use should correlate with a higher SUS score, and our research indeed finds this correlation, we have evidence of construct validity. This type of validity ensures that our tests are not just measuring something, but the right thing.
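
The SUS example can be sketched in a few lines. The code below scores SUS responses with the standard rule (odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5) and then correlates the scores with a separate ease-of-use rating. The participants, their responses, and the 7-point ease rating are hypothetical and deliberately tidy.

```python
from math import sqrt
from statistics import mean


def sus_score(responses):
    """Standard SUS scoring for ten responses on a 1-5 scale; returns 0-100."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    total = 0
    for i, r in enumerate(responses):
        # Index 0, 2, 4, ... are items 1, 3, 5, ... (positively worded).
        total += (r - 1) if i % 2 == 0 else (5 - r)
    return total * 2.5


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical participants: ten SUS responses each, plus a separate
# single-item ease-of-use rating on a 1-7 scale.
sus_responses = [
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
    [2, 4, 2, 4, 2, 4, 2, 4, 2, 4],
    [1, 5, 1, 5, 1, 5, 1, 5, 1, 5],
]
ease_ratings = [7, 6, 4, 3, 1]

sus_scores = [sus_score(r) for r in sus_responses]
r = pearson(sus_scores, ease_ratings)

print("SUS scores:", sus_scores)
print(f"Correlation with ease of use: {r:.2f}")
```

A strong positive correlation is what the theory predicts here. A weak or negative one would be a reason to question whether the scale is capturing usability as intended.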

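A first step toward high construct validity is to run a test-retest.

Strictly speaking, the test-retest method is a reliability check rather than a direct test of construct validity, but a measure that isn’t reliable can’t be valid, so it is a simple and effective place to start. Here’s how it works: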

  • First, you administer your test or survey to a group of people.
  • After a certain period, you give the same test to the same group again.
  • Then, you compare the results from the two tests.

If the results are similar, it suggests that the test is measuring a stable construct. For example, if you’re measuring user satisfaction with an app feature, and users rate their satisfaction similarly on both occasions (provided no changes have been made to the experience), it indicates that your measurement is reliably capturing user satisfaction, which is a prerequisite for construct validity.
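
The three steps above can be sketched as a comparison of two rounds of ratings from the same users. The satisfaction ratings below are hypothetical (1-5 scale, listed in the same user order for both rounds). The sketch reports the correlation between rounds and the average change per user.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical satisfaction ratings from the same eight users, two weeks apart.
round_1 = [4, 5, 3, 2, 4, 5, 3, 4]
round_2 = [4, 4, 3, 2, 5, 5, 3, 4]

test_retest_r = pearson(round_1, round_2)
mean_abs_change = mean(abs(a - b) for a, b in zip(round_1, round_2))

print(f"Test-retest correlation: {test_retest_r:.2f}")
print(f"Average change per user: {mean_abs_change:.2f} points")
```

A high correlation together with a small average change suggests the ratings are stable. What counts as "high enough" depends on the scale and the time gap, so treat any threshold as a judgment call rather than a fixed rule.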

Why These Matter in UX Research

In UX research, reliability and these three types of validity ensure that our findings are accurate and can be trusted to inform design decisions. They help us avoid common pitfalls like basing our decisions on flawed or incomplete data.

Challenges and Considerations

Achieving high reliability and validity can be complex. User behaviors and preferences can change, affecting the reliability of longitudinal studies. Ensuring validity, especially construct validity, requires a deep understanding of the underlying theories and concepts. Hypotheses based on previous research can help here, because they state in advance which relationships the measure should show.

Conclusion

Reliability and validity, including predictive, content, and construct validity, are not just theoretical concepts; they are practical necessities in UX research. By understanding and applying these principles, we can create more effective, user-centered designs and strategies that truly resonate with our users.

As scholars like Carmines and Zeller (1979) and Nunnally (1978) have argued, these principles are crucial for conducting research that is not only methodologically sound but also impactful and relevant in real-world settings.

References:

  1. Carmines, E. G., & Zeller, R. A. (1979). Reliability and Validity Assessment.
  2. Nunnally, J. C. (1978). Psychometric Theory.

P.S. This is a rewrite of a paper that I wrote as part of my PhD work. I have used examples from the world of UX to illustrate how these dense academic concepts are useful for UXRs.
