The Future of Survey Analysis: When Respondents Use ChatGPT to Answer Open-Ended Questions
Imagine you’ve just finished collecting data from a survey with hundreds of open-ended responses, eager to dive into the insights. But as you begin analysing the results, something feels off. The answers sound eerily similar, with the same phrasing, tone, and structure popping up across the board. You dig deeper and realise that many of your respondents have used a large language model (LLM) like ChatGPT to answer the questions. What happens next? How does this affect your data, and what does this mean for the future of survey analysis?
This very scenario came up in a meeting that Dr. Andrew Smith and I had with a gentleman who had been analysing a large data set from a global survey. About 30% of his respondents had clearly relied on an LLM to respond, and suddenly the validity of the data came into question. The survey participants came from diverse locations, and English wasn’t always their native language. Some used the LLM to translate; others answered directly through it. This discovery opened a whole new conversation about framing bias, response quality, and the future of open-ended question analysis.
The Problem with LLM Responses: Framing Bias
When a respondent uses an LLM like ChatGPT to answer a survey question, how they frame the prompt dramatically affects the output. Imagine they simply paste the survey question directly into ChatGPT and hit enter. The model will generate responses that mirror the language and framing of the question itself. Every survey question carries some risk of framing bias, but when the question is fed verbatim to an LLM, that bias is amplified to the point where even the respondent’s own interpretation is removed. Answers left to an LLM are shaped by how the question is worded rather than by the respondent’s own thoughts or opinions.
The result? The responses can become homogenised—everyone’s using the same words, sentences, and ideas. This can skew results significantly, especially if the survey is designed to capture nuanced, personal reflections. The diversity of thought that should emerge from an open-ended question suddenly fades, replaced by the consistency of a machine-generated answer.
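To make that concrete: one simple way to quantify homogenisation is to compare every pair of responses and flag pairs whose word overlap is implausibly high for independently written answers. Here is a minimal Python sketch of that idea; the Jaccard measure and the 0.6 threshold are my own illustrative choices, not a validated screen.

```python
import re
from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two responses (0 = disjoint, 1 = identical)."""
    wa = set(re.findall(r"[a-z']+", a.lower()))
    wb = set(re.findall(r"[a-z']+", b.lower()))
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def suspiciously_similar(responses: list[str], threshold: float = 0.6):
    """Return index pairs of responses whose word overlap exceeds the threshold."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(responses), 2)
        if jaccard(a, b) >= threshold
    ]
```

A cluster of flagged pairs does not prove LLM use on its own, but it tells the analyst where to look first.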
Reframing the Question: A New Kind of Bias?
But what if a respondent takes a different approach? Instead of simply copying and pasting the survey question, they might reframe it when asking the LLM for help. This could introduce subtle differences in the responses, potentially enough to reflect the respondent’s own views or emotions.
For instance, if the survey question asks, “How do you feel about remote work?” the respondent might ask ChatGPT, “What are the advantages and challenges of working from home?” The resulting answer could be quite different, capturing some of the respondent’s intent while still being shaped by the LLM’s training data.
Does this mean reframing helps eliminate bias? Not entirely. It may introduce new biases, depending on how the LLM interprets the rephrased question. Could it lead to richer, more personalised responses, or does it create a new layer of distortion? This uncertainty adds complexity to survey analysis, forcing researchers to consider how much weight they give to LLM-assisted responses.
What About Language Barriers? Is Translation Fine?
In our meeting, we also discussed the role of LLMs in overcoming language barriers. For respondents whose first language isn’t English, using an LLM like ChatGPT as a translator can help them articulate their answers more clearly. In this case, as long as the LLM is being used purely for translation, the responses could still reflect the respondents' true sentiments.
But even here, we must tread carefully. LLM translation is not perfect and can subtly alter the meaning or tone of a response. Are researchers accounting for this when they analyse survey results? How can we ensure the sentiment remains accurate, especially when respondents rely on AI to bridge the language gap?
Co-Occurrence Tables: Detecting AI-Generated Patterns
One powerful tool used in analysing survey data is the co-occurrence table, which identifies patterns of word association in respondents’ answers. When many respondents use LLMs like ChatGPT, the tables can reveal strong word associations that aren’t typically found in human responses. Whether the question has been reframed or not, phrases like “on the other hand” or “in conclusion” may appear with unusual frequency, signalling that an LLM may be at play.
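To illustrate the idea (and not any specific tool from the analysis described above), here is a minimal Python sketch that builds a within-response word co-occurrence table; the example responses are invented.

```python
import re
from collections import Counter
from itertools import combinations

def cooccurrence_table(responses: list[str]) -> Counter:
    """Count how often each unordered word pair appears in the same response."""
    pairs = Counter()
    for text in responses:
        words = sorted(set(re.findall(r"[a-z']+", text.lower())))
        pairs.update(combinations(words, 2))
    return pairs

# Invented toy data: stock phrasing repeated across respondents surfaces
# as unusually strong pair counts.
responses = [
    "On the other hand, remote work can blur work-life boundaries.",
    "Remote work offers flexibility; on the other hand, boundaries blur.",
    "Honestly, I just like skipping the commute.",
]
for pair, count in cooccurrence_table(responses).most_common(5):
    print(pair, count)
```

In practice one would compare the strongest associations against a baseline drawn from surveys collected before LLM use became common, rather than eyeballing raw counts.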
The ability to detect these patterns gives researchers a way to filter out—or at least account for—AI-generated responses. But it also raises a larger question: How much is too much AI involvement in surveys? At what point do we start questioning the validity of the data itself, and how do we balance the benefits of AI in accessibility and ease with the potential drawbacks?
Where Do We Go from Here?
The rise of LLMs like ChatGPT is only going to grow, which means the challenges of AI-assisted survey responses will persist. So, how do we navigate this evolving landscape?
For one, researchers must develop new ways of spotting and mitigating LLM-generated responses. Co-occurrence tables, sentiment analysis, and keyword checks will be essential tools in the fight to preserve data integrity. But beyond detection, we need a broader conversation about how much LLM assistance is acceptable in survey responses. Should surveys explicitly discourage the use of AI tools? Or, should researchers simply adapt to this new reality and focus on filtering and adjusting data accordingly?
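On the detection side, even a crude keyword check can serve as a first-pass screen alongside the co-occurrence approach sketched earlier. A minimal Python sketch follows; the phrase list is illustrative and unvalidated, and anything it flags would still need human review.

```python
# Illustrative (unvalidated) stock phrases often seen in LLM output.
MARKER_PHRASES = [
    "on the other hand",
    "in conclusion",
    "it is important to note",
    "a double-edged sword",
]

def flag_llm_markers(response: str, threshold: int = 2):
    """Flag a response containing at least `threshold` marker phrases."""
    text = response.lower()
    hits = [p for p in MARKER_PHRASES if p in text]
    return len(hits) >= threshold, hits

flagged, hits = flag_llm_markers(
    "It is important to note the benefits; in conclusion, remote work helps."
)
print(flagged, hits)  # True ['in conclusion', 'it is important to note']
```

Such a check is cheap but blunt: fluent human writers use these phrases too, which is why it should prompt scrutiny rather than drive automatic exclusion.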
Moreover, the ethics of AI in survey analysis must be considered. If we allow LLMs to become too involved in respondents’ answers, we risk losing the authentic human voice that surveys aim to capture. Meanwhile, ignoring AI’s growing presence could blindside researchers, leaving them with skewed data that no longer reflects reality.
A Future Full of Questions
AI is becoming deeply embedded in everyday life, and the task of survey analysis will undoubtedly become more complex. LLMs like ChatGPT offer incredible convenience and accessibility, but they also present new challenges in maintaining the integrity of data. The way we approach open-ended survey questions must evolve alongside this technology, keeping in mind the biases and pitfalls that AI can introduce.
The future of survey analysis may well lie in how we adapt to these challenges. By developing new techniques to detect AI involvement, refining our understanding of framing bias, and considering the ethical implications, we can safeguard the value of human insight in surveys. As researchers, how do we find the right balance between leveraging AI’s benefits and mitigating its drawbacks? The answers to these questions will no doubt shape the future of survey research. I wonder who will answer them.