AI in User Experience Research: What’s the Role of Synthetic Data?
In the last decade, synthetic data has been widely used in machine learning, computer vision, healthcare and autonomous driving.
But with artificial intelligence (AI) now permeating deeper into every industry, it’s time to consider how AI, and specifically the synthetic data it creates, should be utilized in user experience research (UXR).
Synthetic Data is Nothing New
In the medical domain, a shortage of annotated data is a significant limitation in medical image processing, particularly when diagnosing rare diseases and when trying to ensure that the data represents various ethnic groups. To address this shortage and the bias it can introduce, synthetic datasets are often generated to provide a sufficient amount of diverse training data for medical applications (e.g., Bauer et al., 2021; Google Research, 2020).
For testing autonomous vehicles, a synthetic environment is created in which moving vehicles can be animated and rendered. Simulators such as CARLA, an open-source project backed by Intel, and Microsoft’s AirSim achieve this using a combination of computer graphics, physics-based modeling and robot motion-planning techniques.
Governments are increasingly using simulated “crowds,” modeled on principles of physics, human psychology and behavior, to plan for comfortable distancing (especially during COVID-19) and to prevent potentially hazardous overcrowding.
The Pros and Cons of Synthetic Data in UXR
User experience research plays a vital role in contextualizing projects by uncovering user personas, their usage contexts and specific requirements. UXR often relies on various research methodologies such as surveys, interviews, diary studies and observations, which can be time-consuming and costly.
With the recent development of generative AI (GenAI) tools that can produce human-like text (e.g., ChatGPT) or images from textual descriptions (e.g., Midjourney), UX researchers have started to experiment with whether GenAI can facilitate research tasks such as creating interview guides and annotating qualitative data.
Since GenAI tools like ChatGPT can closely emulate human ways of speaking, UX researchers have started to wonder whether large language models (LLMs), trained on trillions of tokens of text collected from diverse domains over past decades, can help simulate user interactions, preferences, needs and usage scenarios in various product development contexts. UX researchers also recognize that this text typically comes from the internet and covers a wide range of topics, but that does not mean the information qualifies as user research data.
In one of our research projects focused on luxury goods customers, we experimented with tools such as Synthetic Users, a platform for testing on AI “users,” and compared the outcomes against in-depth interviews that UX researchers conducted with actual luxury customers. We observed that the responses from synthetic users closely resembled human responses, covering topics like purchasing habits and brand loyalty requirements. However, the synthetic answers tended to be less concrete than the real ones. Genuine customers often gave specific, real-life examples to vividly describe their exceptional luxury experiences, while synthetic responses were more generalized. Nevertheless, we found that synthetic users can serve as a starting point for the innovation funnel, ideation and pilot research, and as a way to validate our interview questions before investing time and effort in large-scale research with real users.
Debates have been ongoing over the past year, not only within our EPAM Global Research Institute but also across the wider UXR community. In a comprehensive study, Hämäläinen et al. (2023) compared responses to open questions from humans and LLMs. They highlighted that synthetic data holds promise for generating ideas and conducting pilot experiments. However, they stressed the need to validate AI outputs with real data. Furthermore, Hämäläinen et al. raised concerns about the potential misuse of LLMs by malicious users of crowdsourcing services.
Eismann (2023), while not entirely opposed to synthetic UXR, noted that ChatGPT excels at summarizing key findings from interview transcripts but struggles with verifying or falsifying assumptions. Instead, it tends to “hallucinate” new assumptions not present in the original transcripts.
Ward (2023) posted a synthetic UXR scenario on social media, collecting thoughtful responses from the online audience, including experienced UX researchers and product development leaders. One point raised by Ward’s audience is that, as UX researchers, we all recognize that what users say does not always align with what they actually do.
That is why researchers have developed many methods for exploring people’s latent experiences (Sanders & Stappers, 2013), and these experiences are not easily replicated by AI.
Another point of discussion related to Ward’s post concerns the data used to train the algorithm. AI can only work within the data it has been trained on, whereas humans excel at extrapolating, that is, making predictions beyond known data ranges. We achieve this by drawing on our creativity, intuition, domain knowledge, multimodal grounding, empathy and the social and interactive aspects of sensory input (Li et al., 2023). These distinctly human capabilities are challenging for AI to replicate.
On the other hand, synthetic data has long been employed in various scientific research areas. In machine learning (ML), for instance, synthetic training data is generated to encompass a spectrum of subgroups, such as diverse skin types and demographics, to ensure the robust generalization of ML models.
We acknowledge that the current synthetic UXR tools may predominantly yield generalized and averaged outcomes, potentially overlooking “outliers,” which can be particularly significant in specific situations, such as the accurate diagnosis of rare medical conditions. To avoid overlooking important “outliers,” researchers are deliberately incorporating synthetic patient data related to rare diseases into the ML model training process to enhance the accuracy of diagnosis algorithms.
Synthetic UXR is One Tool, Not a Replacement for Real Users
Clearly, synthetic users should not replace real customers and users in research. Instead, we view them as a potent tool that can make researchers more efficient at piloting studies and sharpening research questions. Data collected from user studies with small samples is inherently biased, as it is challenging to adequately represent a large population with diverse ethnic backgrounds and habits. Synthetic UXR can potentially open doors to new dimensions of user behavior and preferences that may be overlooked in small-sample user research.
Another reason to embrace synthetic UXR is to prepare for the upcoming AI tsunami. By actively experimenting with the tools, we as researchers can validate AI research outputs against real user data and push for the algorithms behind these AI models to be explainable, rather than being left outside the algorithmic “black box,” wondering what is happening inside. While using synthetic UXR tools, it is important to remind everyone that we cannot create a product that works exceptionally well in the real world without some real user data.
In our next article, we will delve into our suggested approaches to applying synthetic UXR to facilitate user research.