Informed consent: vetting research software for privacy

Noreen Whysel
researchops-community
8 min read · May 7, 2024

--

Woman with blonde hair in a bun facing away toward a computer screen.
Image: Pixabay.

We’d like to be sure that data about our research participants stays between us and them, but are our participants fully aware of the data sharing agreements underlying their use of the testing tools? The confidentiality agreement they have with us is only part of the picture.

In this article, I’ll discuss how to ensure that your participants know how their data is collected and how it might be used or shared beyond the scope of the covered research product. I’ll focus on a mini audit of several user testing software packages that we performed, based on the 10 attributes for respectful Me2B commitments that underlie Internet Safety Labs’ ISL Safe Software Specification:

  1. Clear data processing notice
  2. Viable permission
  3. Identification minimization
  4. Data collection minimization
  5. Private by default
  6. Reasonable data use & sharing / Me2B deal in action
  7. Data processing behavior complies with data subject’s permissions and preferences
  8. Data processing behavior complies with policies
  9. Reasonableness of commitment duration
  10. Commitment termination or change behavior

Source: “The 10 Attributes of Respectful Me2B Commitments,” Internet Safety Labs

First, some definitions:

  • “Me2B” flips the traditional B2C (Business to Consumer) shorthand and is designed to put the individual first.
  • “Me2T” is your relationship with the technology itself.

To understand the background, let’s take a brief look at the data privacy legal landscape in the US. I’m not a lawyer, so this is really just a broad-brush overview. Any legal questions should be discussed with your corporate counsel.

Data governance

Participant data may be collected in a number of ways: entered directly into forms as numbers or text, entered into an account profile (if your product has one), or obtained as an aggregated profile from third party data brokers. Behavioural data may also be collected from third parties or from participants’ use of your own app.

Those of us who collect, use and share data from our research participants are subject to a growing number of data protection laws. Each law imposes different requirements, usually based on where the data subject lives, so you want to be sure to get your data governance policies right. And it’s fair to expect the same from usability software that collects and controls data from you and your participants.

Data handling in practice

Researchers collect and store data with a number of different tools that in turn use underlying technology that may also access this data. Knowing what entities might have access to data through the testing platform’s relationship with these underlying tools can help you to evaluate whether you are exposing your team or your participants to risks that come with these technologies. We like to call this the “Me2T” relationship and it is largely hidden from the user.

Lack of notice and consent to share data present significant risks.

Notice of data sharing and consent are key components of many of the data privacy laws that govern which data we can and cannot save, use or share. While the researcher bears risks similar to those of the user testing platform, the platform also bears responsibility for ensuring that anyone participating in a test on its platform receives an appropriate level of notification that data is being collected and shared, and is subsequently given control over whether they continue using it.

Data safety audit

Researchers collect and store data with a number of different research tools, and that creates the Me2T relationship between the individual and the technology. We created a mini audit based on our safety specification. It is not a scientific study: we didn’t draw a randomised sample, and it only reflects software packages that we either use in our own research or have seen documented in forums we participate in. However, the results brought up some interesting questions. (As a note, these are all companies that I have used and am comfortable using.)

Table 1: Data sharing by vendor

A table comparing the vendors and tools we audited on whether they share data or information with major tech companies (Google, Facebook, Twitter, Amazon, Microsoft) or other third parties. The rows list the vendors: Prolific.co, Microsoft Forms, Google Forms, SurveyMonkey, Optimal Workshop TreeJack, User Interviews, Usability Hub, and Typeform. The columns indicate whether each vendor shares data with the respective tech company or third parties, marked as “Yes” or “No” in the corresponding cells.
Source: Internet Safety Labs. Note that Usability Hub is now Lyssna.

You’ll notice from this list that most of the software we looked at shares data with Google and other external vendors. One shared data with Facebook’s ad network and two shared with Amazon and Microsoft (including Microsoft Forms).

In Table 2, you can also see that just for these eight vendors, there are a few dozen companies or company assets that are receiving data. The ones in bold are advertising or tracking software, which often have agreements to sell the data they collect through data brokers. Many of these tools aren’t necessarily exploiting user data, but they are doorways to entities that now have some access to your participants’ data and your participants should know about that.

Table 2: Third party data vendors discovered in this study

The image lists domains and services associated with online tracking, analytics, advertising, and user engagement across three columns. The first includes services like Ada, Azure, Facebook, Google products, and tracking tools. The second has Google services like remarketing, fonts, and tag manager along with Microsoft, LinkedIn, and analytics platforms. The third covers analytics trackers, survey tools like SurveyMonkey, UserInterviews.com, and tracking codes.
Source: Internet Safety Labs

Methodology

To do the analysis, we used a tool from Evidon called Trackermap that exposes tags that allow data sharing between entities. What you’re seeing below is a map of the underlying technologies that expose data from Google Forms and Microsoft Forms. Trackermap is a paid platform that is bundled with Evidon’s Tag Auditor product, but there are free tools, like Augustine Fou’s Page Xray, that map server and data tracking requests.
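If you don’t have access to a commercial scanner, you can approximate this kind of audit yourself by loading the survey page in a headless browser and logging every network request it makes. The sketch below is one way to do that using Playwright (my assumption, not the tool we used for this study); the survey URL is a placeholder for whatever form or test link you want to inspect.

```python
# Minimal sketch: list the third-party hosts a survey page contacts.
# Assumes Playwright is installed: pip install playwright && playwright install chromium
from urllib.parse import urlparse
from playwright.sync_api import sync_playwright

FORM_URL = "https://example.com/your-survey"  # placeholder for the page you want to audit

hosts = set()

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()

    # Record the hostname of every request the page triggers while loading.
    page.on("request", lambda request: hosts.add(urlparse(request.url).hostname))
    page.goto(FORM_URL, wait_until="networkidle")
    browser.close()

first_party = urlparse(FORM_URL).hostname
for host in sorted(h for h in hosts if h and h != first_party and not h.endswith("." + first_party)):
    print(host)  # each of these is a potential third-party data flow
```

Trackermap and Page Xray also categorise and visualise these hosts for you; the point of the sketch is simply that the raw list of outbound requests is easy to inspect before you send a study link to participants.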

Results

Trackermap scans for various requests by external sites. We were particularly interested in advertising (blue), analytics (red), and trackers (gold), as these are most likely to be integrated into a data broker network.

Diagram showing adtech, trackers and related technologies we found in Google Forms, including Google Fonts and Google Images widgets. The legend, which applies to all of the Trackermap figures in this article, colour-codes each node as Ad (blue), Analytics (red), Unclassified (gray), Privacy (pink), Publisher (purple), Tracker (orange) or Widget (green).
Fig. 1: Google Forms Trackermap. Source: Internet Safety Labs.
A network diagram showing Microsoft Form at the center, connected to cdn.forms.office.net, browser.events.data.microsoft.com, js.monitor.azure.com, c.office.com, and Bing Ads nodes, illustrating the various Microsoft services and domains it integrates with.
Fig. 2: Microsoft Forms Trackermap. Source: Internet Safety Labs.
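To make sense of a raw host list like the ones in these maps, it helps to bucket each hostname into the same categories the Trackermap legend uses. The sketch below does this with a small hand-maintained lookup table; the domain lists shown are illustrative guesses on my part, not Internet Safety Labs’ classification data.

```python
# Rough sketch: bucket captured hostnames into ad / analytics / tracker categories.
# The domain lists are illustrative only, not Internet Safety Labs' classifications.
from collections import defaultdict

CATEGORIES = {
    "ad": ("doubleclick.net", "bat.bing.com", "connect.facebook.net"),
    "analytics": ("google-analytics.com", "newrelic.com", "mixpanel.com"),
    "tracker": ("segment.io", "hotjar.com"),
}

def categorize(hosts):
    """Group hostnames by the first category whose domain list matches them."""
    buckets = defaultdict(list)
    for host in hosts:
        label = next(
            (name for name, domains in CATEGORIES.items()
             if any(host == d or host.endswith("." + d) for d in domains)),
            "unclassified",
        )
        buckets[label].append(host)
    return dict(buckets)

# Example with a few hostnames of the kind a forms scan might surface:
print(categorize(["www.google-analytics.com", "stats.g.doubleclick.net", "cdn.forms.office.net"]))
```

In practice you would swap in a maintained tracker list rather than a hand-written table, but even this crude grouping makes it obvious which requests deserve a mention in your consent language.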

We started with Google Forms and Microsoft Forms because they are popular, free tools that don’t require a lot of expertise to set up. While we expected to see a lot of sharing within their own advertising networks, we only saw Microsoft sharing with Bing Ads. Google Forms did not share data with their advertising network.

Can the participants see this? Well, Google doesn’t require a consent notice, but researchers can add an additional description with information about the study and details for informed consent, if they choose to. Significantly, most of the form-based surveys that we reviewed didn’t actually do this.

A savvy user may see that Google has its own privacy policy linked at the bottom of the form. That’s one potential relationship, but the Google Forms survey we reviewed also indicated that there was another company involved, a panel recruiter called SurveySwap. This is another Me2T relationship. This means that there are a few third party technologies in play here (Google and the panel recruiter), but no reference to the consent practices for any of these underlying relationships other than Google’s privacy policy link. So maybe Google Forms doesn’t share much, but in this case the participants in this survey are potentially exposed to data sharing by the panel company (see the Trackermap results for SurveySwap below).

A diagram depicting the surveyswap.io platform and its integration with various Google services like Analytics, Tag Manager, Fonts, Recaptcha, AdWords Conversion, Dynamic Remarketing, and DoubleClick, as well as third-party tools like the Hotpiscuit widget, Clickcase tracker, Segment, Mixpanel, and Customer.io.
Fig. 3: Surveyswap Trackermap. Source: Internet Safety Labs.

We ran a few other tests. The table below shows the number of trackers, ad networks and analytics packages for several products commonly used in user research.

Table 3: Ad networks, data trackers and analytics packages by vendor

A table comparing different vendors of forms, panel recruiters, tree tests, card sorts, usability tests, and first click tests in terms of the number of ad networks, data trackers, and analytics tools they employ.
Source: Internet Safety Labs. Note that Usability Hub is now Lyssna.

Below are the tracker maps from live tests at the usability testing platforms that we examined, and you can see that these platforms share with both DoubleClick and Google Analytics:

A diagram illustrating the trackers, ad networks, widgets, and analytics used by Usability Hub’s usability and first click tests and Optimal Workshop’s tree tests and card sorts. Mostly Google services are employed across these tools.
Fig. 4: Trackermap results for Usability Hub’s (now Lyssna) Usability Test and First Click Test and Optimal Workshop’s TreeJack tree test and Optimal Sort card sort. Source: Internet Safety Labs.

The survey vendors we examined tended to have a smaller number of tracking vendors:

A visualization of the tracking elements used by the survey software TypeForm and SurveyMonkey. TypeForm has no trackers or ad networks, while SurveyMonkey uses Google Tag Manager and New Relic analytics.
Fig. 5: Trackermap for TypeForm and SurveyMonkey forms. Source: Internet Safety Labs.

The third group that we looked at was panel recruiters, where we saw a lot of data sharing with entities like Facebook Ads, DoubleClick, Microsoft Marketing and Adobe Metrics:

A network graph showing the trackers, ad networks, widgets, and analytics tools used by UserInterviews.com and Prolific.io. UserInterviews uses various Google, Facebook, Microsoft, and third-party services, while Prolific uses fewer trackers and widgets.
Fig. 6: Trackermap results for UserInterviews and Prolific.io. Source: Internet Safety Labs.

…you should be asking yourself whether your participants are aware of these relationships and whether … [vendors] have access to the data they provide to you.

It’s important to note that panel recruiters create a relationship with the participant at the time the participant creates an account with the recruiter, usually before they sign up for your study. It’s not a relationship you control, and your research data is unlikely to be shared with the recruiter unless you use their platform to run the survey.

When you look at these results, you should be asking yourself whether your participants are aware of these relationships and whether they are aware that these entities might have access to the data they provide to you. We feel it’s a good idea to remind participants of any Me2T consent relationships that they have already entered into when they participate in your study.

What else can you do?

Product development is flawed: often there is no consent at all when testing with potential users. What are some of the other things that you can do to ensure that you are fulfilling your role as a data collector?

Researchers should be advocating for informed consent, highlighting all of the potential recipients of the participant’s data, and referencing in the informed consent document any additional data policies underlying the usability platform, software or panel recruitment programs in use. And you should make all of this part of your vendor selection process.

Software testing platforms should take a closer look at their data protection responsibilities and make a greater effort to inform participants and test creators of their data sharing policies, not just once, but every time the software is used.

View Noreen’s lightning talk, “Informed Consent: Are Your Users Aware of What They Share?” at USENIX’s 2022 Symposium On Usable Privacy and Security (SOUPS 2022).


Noreen Whysel
researchops-community

Co-Founder @DecisionFish, Bizdev @DisruptiveTechnologists, Researcher @InternetSafetyLabs @AMPS_Research & @KantaraRIUP, Adjunct @CityTech, IA/Research/MSLIS