What to do when your research participant is fraudulent in an online qualitative study? 6 scenarios and guidelines for HCI and UX researchers

Published in Health and HCI, May 8, 2024

Note: This article was written based on the findings from the research paper, “Understanding fraudulence in online qualitative studies: From the researcher’s perspective”. For a more comprehensive perspective on the topic and the full report of findings, please check out the paper here. The paper will be presented at the CHI 2024 conference on Mon, 13 May at 4:30pm in Honolulu, HI.

Consider this: You posted your research recruitment flyer on social media. You are soon elated to find that you received dozens of responses. You start your standard process of putting together a list of interesting potential participants by cleaning out spammer data and incomplete responses. You then email and schedule interviews from your list. Cut to — the actual interview — you are surprised by the recruited participant not being the demographic (age, gender, or race) they self-reported originally. You proceed with the interview as it could have been an unintentional mistake. However, you realize that the participant is not saying anything specific about the study context. There are inconsistencies, illogical statements being made, and the data overall doesn’t tell a coherent story. What is happening? Is your participant not real? How do you handle the situation fairly and ethically? What would you do with the collected data?

There have been cases of individuals faking their identity and experiences to gain access to research studies.

Today, many qualitative studies are conducted online. The option of a remote or virtual format has made it possible to reach more diverse audiences, make participation more accessible, and streamline data collection and analysis. Unfortunately, the flexibility of taking part in research from behind a computer screen, with no in-person logistics to provide an additional layer of verifiability, has also attracted individuals seeking to commit fraud. Apart from our own work focused on HCI and UX researchers, a few notable cases of this in health and social sciences domains can be read here, here, here, and here.

What is “fraudulence” in research study contexts? What is the harm it causes?

Fraudulent activity or behavior, in research study contexts, is when respondents sign up for participation — either for personal gain (e.g., incentives, access to unreleased products, or not-yet available medical treatments), or to cause harm or disruption in a research study. This may involve such individuals completing their data inappropriately, exaggerating or faking their experiences, and possibly their identity and other characteristics.

Fraudulence, when it occurs, can cause harm to the ecosystem of research actors and activities. Here we discuss three of those harms:

The first harm is to the research study, where the integrity of the findings is called into question because they are informed by unverifiable data.
The second harm is to “real” or well-intentioned participants, who may find themselves in situations like focus groups where they are sharing confidential information next to a fraudulent party.
The third harm is to the researchers themselves, who have to figure out what happened and proceed in a way that is respectful, fair, and protective of human subjects. Experiencing fraud may also lead researchers to doubt their ability to produce research artifacts and conduct research effectively.

Illustrated view of the harms caused by fraudulence with a few examples shown in bullet points

What are some scenarios of fraudulence and how can qualitative researchers be better prepared for it?

In our study, we interviewed 16 HCI qualitative researchers about their experiences encountering fraudulent participants. The scenarios described below are a simplified synthesis of a portion of our study findings. For a more nuanced report, complete with researcher accounts and quotes, please see our full paper. Each scenario features a few suggested next steps and possible tensions or limitations to consider.

Scenario 1: Data Mismatch

A participant’s responses during a research session are different from what was self-reported or collected during screening.

In this illustrated scenario, the participant introduces themselves and states their age. However, the researcher is confused because they were expecting a 16-year-old, given the nature of their study and the prior screening.

Suggested next steps:

Since one answer being different isn’t enough to assume that the participant is untrustworthy, the researcher could re-ask additional questions from screening and compare the two responses. For example, the researcher could later on ask, “Just to confirm — what year were you born?”. In the case of non-fraudulent participants, this would give them a chance to correct themselves. However, if the participant is fraudulent and makes multiple mistakes, it can serve as evidence of the issue.
If it has become clear that the participant is being deceptive, the researcher can end the session citing ineligibility, or back out of the interview and follow up with the participant later via email.
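The comparison between screening answers and answers re-asked during the session can be sketched in a few lines of Python. This is a minimal illustration, not a method from our paper: the function names, field names, and records below are all hypothetical, and a flagged field is only a prompt for a gentle follow-up question, never proof of fraud.

```python
def normalize(value) -> str:
    """Compare answers loosely: ignore case and surrounding whitespace."""
    return str(value).strip().casefold()


def mismatched_fields(screening: dict, session: dict) -> list[str]:
    """Return the fields answered in both records whose values differ."""
    shared = screening.keys() & session.keys()
    return sorted(
        field for field in shared
        if normalize(screening[field]) != normalize(session[field])
    )


# Hypothetical records for one participant (assumed field names).
screening = {"birth_year": 2008, "city": "Austin", "occupation": "student"}
session = {"birth_year": "1999", "city": " austin ", "occupation": "Student"}

flags = mismatched_fields(screening, session)
print(flags)  # ['birth_year']
```

Normalizing before comparing keeps harmless differences in capitalization or spacing from being counted as mismatches, so the researcher's attention goes to answers that actually changed.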

Some things to keep in mind:

Abrupt ending of interviews can lead to hostile situations, such as the fraudulent participant becoming agitated or assertive about compensation. If there is such a risk, it might be better to end the session politely and follow up via email to document any potential abuse.
The researcher should always keep the target population in mind and be aware of explainable data mismatches (e.g., a participant transitioning their name or gender may result in data mismatches between screening and participation).
The researcher has to be careful and only deem a participant ineligible if there is clear evidence of a data mismatch.

Scenario 2: Poor Data Quality

A participant has little to no context on the study and cannot answer questions with specific details. There may also be significant stalling and delays.

In this scenario, the researcher is asking the participant pointed or very specific questions about crocheting, a practice in which they self-reported being highly skilled. However, the participant does not seem to understand the terminology and gives quick, abrupt answers.

Suggested next steps:

For future studies, the researcher could place questions that confirm knowledge of the study context at the beginning of the research session. If participants do not provide sufficient information or are unwilling to talk, the session could be ended early, with partial compensation as an option.

Some things to keep in mind:

There is no clear evidence in such scenarios that the participant is fraudulent. Participants may not be able to express themselves at the level of detail the researcher prefers, or they may be guarded about their experiences.
The researcher should strive to give participants the benefit of the doubt and be respectful.
The researcher also has to be cognizant of any biases influencing their judgment so as to not unintentionally exclude certain people or populations. For example, people who crochet may have various considerations for yarn choice — color, feel, cost, etc.

Scenario 3: Uncooperative Behavior

A participant is unwilling to do the study procedures and tries to get through the session with minimal effort.

In this scenario, the researcher asks the participant if they could screen share and walk them through a process for a user testing study. The participant suddenly becomes very resistant, and tries to avoid the steps required of the study.

Suggested next steps:

The researcher should include all study expectations and eligibility criteria in the recruitment materials and remind participants that not meeting the inclusion criteria at the time of the session could lead to rescheduling or ending the study.

Some things to keep in mind:

In the event the recruitment or ethics materials are not explicit about video sharing, it could be considered a breach of privacy if participants are asked to turn on their video (especially when triggered by the researcher’s suspicion).
The researcher should be aware of the technology and resource limitations of the target population (e.g., if a participant is using a public computer at the library, they most likely will not be able to install software).

Scenario 4: Suspicious Identity

A participant has a generic email, cannot be found online, or does not have an expected public presence.

This scenario shows the researcher doubting that the participant is who they claim to be, because multiple factors have drawn suspicion. Examples include the participant repeatedly getting their own name wrong, not having any online presence, or not being able to answer contextual questions such as “what do you like about living or working in [reported location]?”.

Suggested next steps:

The researcher can choose to recruit via snowball sampling or through authenticated participant repositories.
If a researcher suspects identity misrepresentation, they should not include the data in their analysis because it could introduce biases and inaccuracies in the data. They should report how many fraudulent participants they did not include in the data analysis.

Some things to keep in mind:

Snowball sampling may not reach a wide range of people.
Authenticated participant repositories may be inaccessible to the researcher or require payment for use. Additionally, relying only on such repositories may lead to always using a pool of “professional participants” and this may impact the study, depending on its nature.

Scenario 5: High Response Rate

A participant is filling out multiple screeners, not really paying attention to screener questions, and using fake information.

In this scenario, the researcher sees a pattern of respondents with similarly formatted information and fake-seeming emails. While this is easy to spot upon close inspection, it can be missed when fraudulent individuals bombard a screener survey in large volumes. Fraudulent respondents may also go the extra mile to present themselves as very “desirable” participants by providing elaborate details or expressions of interest in the qualitative fields, and by choosing a variety of skill levels and races to increase their chances.
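When a screener receives responses in large volumes, a small script can help surface similarly formatted emails for manual review. The sketch below is one possible approach under our own assumptions: the `email_shape` and `flag_similar_emails` helpers, the threshold of three, and the sample addresses are all hypothetical. Many genuine participants have addresses of the form name-plus-digits, so a flagged group is a reason to look more closely, not a verdict.

```python
import re
from collections import defaultdict


def email_shape(email: str) -> str:
    """Reduce an address to its pattern: letter runs become 'a', digit runs '9'."""
    local, _, domain = email.strip().lower().partition("@")
    shape = re.sub(r"[a-z]+", "a", local)
    shape = re.sub(r"\d+", "9", shape)
    return f"{shape}@{domain}"


def flag_similar_emails(emails, min_group_size=3):
    """Group addresses by shape and keep groups large enough to review by hand."""
    groups = defaultdict(list)
    for email in emails:
        groups[email_shape(email)].append(email)
    return {
        shape: members
        for shape, members in groups.items()
        if len(members) >= min_group_size
    }


# Hypothetical screener responses.
responses = [
    "johnsmith4821@gmail.com",
    "marylee9930@gmail.com",
    "bobking1177@gmail.com",
    "researcher.name@example.edu",
]
to_review = flag_similar_emails(responses)
```

Here the three addresses that share the shape `a9@gmail.com` are grouped for review, while the fourth address is left alone.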

Suggested next steps:

During the screener survey design, the researcher can include attention check features and data triangulation mechanisms (asking the same questions differently to see if they have different answers). See example participant validation guide for ideas.
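As a rough illustration of the attention check and data triangulation ideas above, the following Python sketch screens a single response. Everything in it is an assumption made for the example: the age brackets, the field names, the instructed answer "strongly disagree", and the function names are hypothetical and would need to be adapted to the actual screener.

```python
from datetime import date

# Hypothetical age-bracket options offered by the screener.
AGE_BRACKETS = {"18-24": (18, 24), "25-34": (25, 34), "35-44": (35, 44)}


def passes_attention_check(answer: str, expected: str = "strongly disagree") -> bool:
    """Check an item such as 'Select "strongly disagree" for this question'."""
    return answer.strip().casefold() == expected


def bracket_matches_birth_year(bracket: str, birth_year: int, today: date) -> bool:
    """Triangulate age: the same fact asked as a bracket and as a birth year."""
    low, high = AGE_BRACKETS[bracket]
    oldest_possible = today.year - birth_year   # birthday already passed this year
    youngest_possible = oldest_possible - 1     # birthday still to come this year
    return youngest_possible <= high and oldest_possible >= low


def screen_response(response: dict, today: date) -> list[str]:
    """Return human-readable flags for manual review (an empty list means none)."""
    flags = []
    if not passes_attention_check(response["attention_item"]):
        flags.append("failed attention check")
    if not bracket_matches_birth_year(
        response["age_bracket"], response["birth_year"], today
    ):
        flags.append("age bracket and birth year disagree")
    return flags
```

A usage example with a made-up response: `screen_response({"attention_item": "Agree", "age_bracket": "18-24", "birth_year": 1980}, date(2024, 5, 8))` returns both flags. The function only collects flags rather than rejecting anyone, which leaves the final judgment with the researcher.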

Some things to keep in mind:

Multiple fraud-prevention measures used concurrently may make the screener burdensome and result in fewer responses.
Fraud-prevention measures on the survey are extra effort for the researcher.

Scenario 6: Red Flags noted by the Researcher

A participant exhibits a number of odd characteristics that stand out to the researcher.

In this scenario, a participant is telling the researcher about their health diagnosis and the age of onset. The researcher is shocked because the statement does not match with what is typical of the disease or what is known to the researcher (the expert in this case).

Suggested next steps:

The researcher should tread very carefully, stall where possible, and consult with mentors or experts.
The researcher should periodically check their biases (see an example bias checking tool here) and regularly engage in implicit bias workshops that provide actionable activities to address one’s biases (see an example workshop report here).

Some things to keep in mind:

The cognitive burden on the researcher will be high in such situations due to the unknowns and having to deal with the situation on the spot.
There is a strong risk of misjudging a participant.

Thanks for reading! In conclusion, our intent with this list of scenarios is to document and share with the community a few different ways in which fraud has been taking place in qualitative research contexts. We hope these serve as a guide for other researchers on what to look out for and be prepared for with regard to fraud. We further aim to improve and iterate on the recommended guidelines moving forward.

If you have experienced fraud as a qualitative researcher and have suggestions too, please leave a comment or email apanicke@iu.edu — I would love to chat more.

Credits: Icons and visuals used to illustrate the scenarios are from Freepik.com

Paper Reference: Aswati Panicker, Novia Nurain, Zaidat Ibrahim∗, Chun-Han (Ariel) Wang∗, Seung Wan Ha∗, Yuxing Wu∗, Kay Connelly, Katie A. Siek, and Chia-Fang Chung. 2024. Understanding fraudulence in online qualitative studies: From the researcher’s perspective. In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI ’24), May 11–16, 2024, Honolulu, HI, USA. ACM, New York, NY, USA, 17 pages. https://doi.org/10.1145/3613904.3642732 (Author names are listed in order of contribution. ‘∗’ stands for equal contributions where applicable.)

Written by Aswati Panicker