The Ethics of Editing: How Much Should You Clean Up Your Participants’ Words?
In qualitative research, transcription is rarely a neutral act. Turning spoken words into text is an act of interpretation, one that raises uncomfortable questions. Should we preserve every pause, repetition, and grammatical slip? Or smooth things out so participants sound more articulate, more readable... more like what we wish they had said?
Editing participant quotes is often treated as a necessary evil. We trim the mess to focus on “key insights” or “representative themes.” But with tools like Leximancer, the rules are starting to shift - and the pressure to clean up may no longer apply at all.
The Case for Verbatim - and Its Limitations
Traditional verbatim transcription has long been a gold standard in qualitative research, especially in disciplines like discourse analysis or grounded theory. By capturing every “um,” stutter, and self-correction, we get closer to the natural rhythm of human speech - which can, in itself, reveal meaning.
But anyone who’s analysed interviews before knows that verbatim transcripts are hard to read, hard to code, and time-consuming to analyse. They can also risk making participants (especially those who aren’t native speakers or who speak informally) appear less knowledgeable or composed than they are.
Many researchers end up “cleaning” the data to avoid misrepresentation. But that raises its own question: are we sanitising only the speech, or the complexity of lived experience along with it?
The Problem with Trimming “Just the Key Quotes”
A common justification for heavy editing is that we only need to include quotes that “serve a purpose”: to support a theme, to illustrate a point, or to add nuance to an interpretation.
But what if the issue isn’t the quotes but the method?
When you’re relying on hand coding and manual theming, it makes sense to cherry-pick a few quotes that reinforce your analysis. It’s efficient. It’s focused. But it’s also selective. And that selectivity introduces bias.
At that point, you’re not interpreting data. You’re filtering it.
With Leximancer, You Don’t Have to Choose
Leximancer approaches qualitative analysis differently. Rather than requiring you to identify and extract key quotes manually, it processes all of the text - messy, meandering, and authentic - and builds a conceptual map based on the actual language used.
This means:
You don’t need to decide which bits to include.
You don’t need to clean up first “for clarity.”
You don’t risk skewing your analysis by trimming what doesn’t fit your expectations.
The tool doesn’t rely on your preconceived coding scheme. It surfaces concepts based on word associations and frequency across the entire dataset, giving you an emergent, unbiased map of what your participants actually said - not just what you remembered or flagged.
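To make that idea concrete, here is a deliberately simplified Python sketch of frequency- and co-occurrence-based concept surfacing. To be clear, this is not Leximancer’s algorithm - the stopword list, window size, and sample interview snippets are all invented for illustration. The point is only to show how terms and their associations can emerge from the full, unedited text rather than from quotes a researcher has pre-selected.

```python
# Toy illustration of concept surfacing from raw transcripts.
# NOT Leximancer's actual algorithm - just a minimal sketch of the general
# idea: frequent terms and their co-occurring neighbours emerge from the
# full, unedited text rather than from hand-picked quotes.

import re
from collections import Counter, defaultdict

# Ad hoc stopword list (an assumption for this sketch), including fillers.
STOPWORDS = {"the", "a", "an", "and", "or", "but", "i", "you", "it", "is",
             "was", "were", "to", "of", "in", "that", "we", "um", "uh", "like"}

def tokenize(text):
    """Lowercase the text, split into word tokens, and drop stopwords/fillers."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def surface_concepts(transcripts, window=5, top_n=5):
    """Return the most frequent terms and their strongest co-occurring terms."""
    freq = Counter()
    cooc = defaultdict(Counter)
    for doc in transcripts:
        tokens = tokenize(doc)
        freq.update(tokens)
        # Count co-occurrences within a sliding window of `window` tokens.
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + window]:
                if other != word:
                    cooc[word][other] += 1
                    cooc[other][word] += 1
    # The most frequent terms stand in for "concepts"; their top associates
    # sketch the conceptual neighbourhood.
    return {term: [assoc for assoc, _ in cooc[term].most_common(3)]
            for term, _ in freq.most_common(top_n)}

if __name__ == "__main__":
    # Made-up interview snippets, kept messy on purpose.
    interviews = [
        "Um, the waiting times were, like, really long and the staff seemed stretched.",
        "The staff were kind but the waiting room was crowded and the times were long.",
        "I felt the staff listened, even when the waiting went on and on.",
    ]
    for concept, associates in surface_concepts(interviews).items():
        print(f"{concept}: associated with {associates}")
```

Even in this toy version, the fillers and grammatical slips don’t distort the map - they are simply part of the language the counts are built from, which is the whole point of leaving the transcripts unedited.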
The Ethical Case for Leaving It All In
When we use tools like Leximancer that allow us to include the full dataset - unedited, unfiltered - we’re making a powerful statement: everyone’s voice matters, not just the voices that are easy to quote.
We don’t need to smooth out every sentence or edit for grammar. The analysis is done at the conceptual level, not the stylistic one. And because the tool is indifferent to eloquence, all participants are on an equal footing.
That’s particularly important in research involving marginalised or multilingual populations, where traditional editing choices can unconsciously perpetuate bias.
So… Should You Still Clean Up?
If you’re quoting participants directly in your write-up, a little light editing for readability can be appropriate - as long as you’re transparent about it. But when it comes to your analysis, tools like Leximancer free you from having to choose what’s worth including.
You can include everything.
No need to agonise over whether a quote supports a theme. Leximancer will show you what’s important based on the actual language patterns in your dataset - not just what catches your eye.
This new age of machine-supported qualitative research is the right time to question long-held assumptions about data cleaning, editing, and selection. What used to be a necessary compromise is no longer required - not when tools exist that can handle complexity without needing it to be tidied up first.
With Leximancer, the full messiness of human speech becomes a strength, not a problem.
Curious to see how Leximancer handles your real-world transcripts - unedited?
Get in touch for a trial or walkthrough, and see what your participants are really saying… without trimming a word.