YouTube recommends…

Peter Kahlert
Published in STS@ENS
Mar 11, 2022

[This is a translation of our last post]

What does YouTube show you concerning the 2021 German federal election? This was the question we asked in the BMBF-funded research project DataSkop for our first data donation pilot campaign. From mid-July to the end of August, YouTube users could support our research by donating their data. For this purpose, a data donation platform was set up where those interested in donating could find information about the ongoing projects and then collect, explore, and finally (after explicit consent) donate their data for research via the software. These donations enabled research with data related to actual YouTube profiles, rather than synthetically reproduced personae. Through the data donation software, it was possible to gain authentic impressions of the recommendations, auto-play paths, YouTube’s newsfeed, and the search function. In the context of the last federal election in Germany, this is particularly interesting and culturally significant, which justifies the research. At the same time, given the debates about radicalization and misinformation, the topic normatively charges the research.

For our project, however, this is no contradiction. “DataSkop – What happens to my data?” (the translation of the project’s full title) is part of the “Digital Autonomy Hub” and deals with self-determination in digital environments, both in terms of content and normatively. With partners from design (FH Potsdam), media education (University of Paderborn & Mediale Pfade), and the NGO AlgorithmWatch, a critical observer of processes of algorithmic decision-making, a data donation infrastructure has been created to support research projects like our pilot projects, which aim to provide critical research and pedagogy.

To understand the project’s overall picture, its idea and concept, it is worthwhile to take a closer look at the process and concept of data donation we have applied. DataSkop is not only the name of the project itself but also of the software created by AlgorithmWatch, which serves as a data donation infrastructure. It can be found on the project’s website, which again has the same name: dataskop.net. The application can lead potential data donors to the respective data donation projects or, in our pilot case, de facto directly to our pilot project. To donate, users logged into their own YouTube profile within an emulated web browser and thus made data about their behavior on the platform available. The data is never donated immediately, but first stored locally for the user to explore. Only at the end of the procedure and after explicit consent is the data donated, a step that can be revoked at any time. The data donation program runs its data collection en bloc, which means that we did not record any user behavior in real time, but collected the already existing watch history, “likes”, and subscriptions. The rest of the data comes from a set of experiments which are run automatically. The program visits a selection of seed videos under the potential data donor’s YouTube profile and follows the first recommendations (i.e. what would be played with YouTube’s “autoplay” function) for seven repetitions each, looks at the newsfeed and the recommendations in the first “headlines”, and runs certain predefined search queries. The results are recorded. People interested in donating can explore their profile data and find out details about the first DataSkop pilot project during these procedures. In addition, there is a voluntary questionnaire that asks for demographic details, a self-assessment (what is watched most?), and technical details about platform use (e.g., whether YouTube is used behind a VPN). However, no differences could be identified based on these control variables. The software also automatically logs users out of their profiles again and runs the observations on the news feed and search queries a second time for comparison. Only if consent to data donation is given at the end is the collected data transmitted.
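The auto-play experiment described above can be pictured as a simple loop. The following is only a minimal sketch under stated assumptions, not the DataSkop implementation: the function name `follow_autoplay`, the `next_video` callback, and the toy recommender are hypothetical stand-ins for whatever the software actually does inside the emulated browser.

```python
from typing import Callable, List


def follow_autoplay(seed: str, next_video: Callable[[str], str], steps: int = 7) -> List[str]:
    """Return the seed followed by `steps` first recommendations.

    `next_video` is a hypothetical callback that returns the top
    (auto-play) recommendation for a given video.
    """
    path = [seed]
    for _ in range(steps):
        path.append(next_video(path[-1]))
    return path


# Toy stand-in for the platform: maps a video to its top recommendation.
toy_recommender = {"seed": "a", "a": "b", "b": "a"}

path = follow_autoplay("seed", lambda video: toy_recommender[video])
print(path)
```

With seven repetitions, each recorded path holds the seed plus seven recommended videos.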

A few core problems of our research must be mentioned in advance. They stem on the one hand from the data and its collection, and on the other from the research object, the platform itself. The data scatter very widely in nominal terms, i.e. we have recorded a huge number of different videos, most of which appear only once in the data we analyzed. This holds regardless of whether the data stems from the histories of the users or from the recommendation algorithm itself, e.g., the suggested videos. This makes the data very messy but still allows for heuristic cuts. There are a few highly visible videos and a vast and virtually impenetrable field that stretches beyond those. Hence, we added qualitative tactics to our digital methods mix, sifting through document- and media-based discourses for key terms, sources, and content fit for data-analytic attention. We used the resulting overviews of compositions and distributions for further qualitative tracking of traces and topics, which we had compiled from Google Trends as well as from the election programs of the parliamentary parties. This research also grounded the independent variables of our experiments. Additionally, we observed mass and social media discourses at the time of the data collection, adapted our experiments in due time, and used these ‘trends’ after the data collection to filter and investigate our data.

In any case, the data sample resulting from the donations is by no means balanced or representative. The call for donations alone, which was disseminated via Der Spiegel, but above all via a YouTube video of the channel “Ultralativ”, resulted in a corresponding bias in our sample. After the removal of all incomplete and erroneous data donations from the original 5000 donations, only about 2000 (exactly: 1980) remained, presumably all from donors with better hardware and internet connections. The analyses below all refer to this smaller, adjusted data set (n = 1980). It can also be noted that our data donations come primarily from men around 25 years old and tend (a bit) to come from northern Germany, with a slight focus on Berlin. Our donors’ top YouTube content interests are “Gaming” (2507), “Science & Technology” (2102), “Entertainment” (1541), and “Education” (1541).

Distribution of frequencies of self-declared YouTube user interests of our data donors (as given by the platform in terms of “categories”). Translation from left to right: “Gaming”, “Science & technology”, “Entertainment”, “Education”, “No self-disclosure”, “Comedy”, “News & Politics”, “Music, animals, and sport”, “Film & animation”, “People & blogs”, “Cars & vehicles”, “Society”, “Travel & happenings”, “Tips & styling”, “Social commitment”.
Distribution of self-declared gender of our data donors. Translation from left to right: “Male”, “Unknown”, “Female”, “Non-Binary” (literally: “diverse”).
Distribution of self-declared age of our data donors. “-1” means no self-disclosure.
Distribution of the first two digits of the postal code by self-disclosure of our data donors. Missing and non-applicable values have been removed. Note that, as a rule of thumb, a “lower” number at the beginning of the postal code indicates a more northern region in Germany.

Smaller differences are nevertheless recognizable in the data, and a few focal points or videos, or channels that are shared by several donors can also be identified.

Scatter map of our data donors’ watch histories. The droplets depict single, individual watch histories with no or barely any shared items. The scattered center contains the more and most common items and users. The size of the droplets correlates with the extent of the individual watch history. However, the extents of donated watch histories vary.

For example, in our sample, users predominantly either liked “Ultralativ’s” video on the YouTube algorithm or “Rezo’s” “Destruction of the CDU” video, but rarely both. This surprised us, but so far it remains a mere observation. Overall, the data scatters too widely and predominantly breaks down into the smallest individual cases, so that connections cannot be qualified without further auxiliary constructs. In addition, the histories are sometimes flooded with videos that have not yet been viewed by the users, presumably an artifact of auto-play and auto-loading on some parts of the platform (e.g. channel pages). While these videos can of course be filtered out, they require further investigation and presumably displace viewed videos in the scraping process. Another problem at the data donor/platform interface is the inconsistencies that arise from the use of the YouTube platform in other regions (DE, AT, UK) and the selection of a different language. Here, apart from minor problems with parsing the data, the sorting of categories sometimes differs. Thus, coherence is in question.

This is just one of the platform-specific research problems. A/B tests (e.g. filtering recommendations by tags) present us with similar challenges, and equally problematic is the change in metrics in real-time, such as the number of likes and dislikes, and the not-so-rare event of videos disappearing: for example, one of our seed videos, the one of “CGArvay” on alleged long-term effects of mRNA-vaccination, can no longer be found on YouTube. We do not know why it had been removed. The public broadcasting channels that cavort in our data are removing videos from the platform, too, on a regular basis, as they are switching their content to “private” outside of certain publication horizons.

This situation makes research and analysis a difficult undertaking, and any research dealing with such complex data and model ecologies must remain a collection of circumstantial evidence. In any case, data donations contribute to the evaluation of the validity and utility of synthetic data. However, the two independent requirements of researching such large socio-technical data ecologies are, on the one hand, the investigation of the models themselves and, on the other, the investigation of their users, their practices, and their interactions with and between content. The former is needed to form a model-analytical judgment about the settings and assumptions implemented on the platform. The latter is needed to critically understand how the final reality of use comes about, which can only be reconstructed based on empirical data analysis.

Thus, it can be reported briefly in advance: if one looks purely statistically at the largest clusters in the data set, one hardly finds anything spectacular. The recommendation paths scatter into countless individual cases, and unsurprising mass accumulations appear on the surface. It hardly makes a difference whether the data originates organically, i.e. directly through usage (e.g. watch history), or from the recommender itself: there is an extreme number of individual items, and the repetition of videos flattens out into the single-digit range, a pattern that can be found with most, sometimes even all, data donors.

Distribution of item repetition between donated watch histories. The outlier (1088) is the “Ultralativ” video that has featured our project and call for data donations.
Distribution of item repetition between recommendation paths in our auto-play experiment.
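A distribution of item repetition like the ones shown above could be derived along the following lines. This is a minimal sketch with made-up toy data; the function name `repetition_distribution` and the input format (one list of video IDs per donor) are assumptions for illustration, not the project’s actual data schema.

```python
from collections import Counter


def repetition_distribution(histories):
    """Map 'number of donors sharing a video' to 'number of such videos'."""
    donors_per_video = Counter()
    for history in histories:
        # Count each video at most once per donor.
        donors_per_video.update(set(history))
    return Counter(donors_per_video.values())


# Toy data: three donors, one video shared by all of them.
histories = [["v1", "v2", "v2"], ["v1", "v3"], ["v1", "v4"]]
dist = repetition_distribution(histories)
print(sorted(dist.items()))
```

In data like ours, almost all of the mass sits at a count of one, with a few outliers such as the video that featured the call for donations.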

Furthermore, the news feed appears seriously curated, although one channel clearly dominates the feed in our data. The search function appears almost not personalized at all. However, if one looks more closely, switching between the quantitative-analytical and qualitative-descriptive lenses, one notices questionable boundary-crossings and entire niches. As the project’s social science partner, we have focused our research in this first pilot project primarily on reflexive inquiry, description, and, last but not least, methodological experimentation. One focus lies on the phenotypical description of the recommender system: what structural and content references are made on the platform, and what experiential content results from this? In doing so, this research was also an experiment with the “data donation” method itself, an attempt to explore its limits and potential as well as to compare it with synthetic research methods, such as the use of artificial personae.

After long and intensive work with the data — and before the mills of academic publishing have slowly done their grinding — I am pleased to present here a small selection of detailed observations and analyses, as well as deeper insights into the underlying research concept.

For the search function experiment, we selected political or federal election-related terms, such as Baerbock, Laschet, Scholz, for which our donors had done a YouTube search, one time with their account logged in, and another time without logging in, so to speak ‘anonymously’ and non-personalized. It turned out, however, that the YouTube search results are little or not at all personalized — at least for our data donors and the terms we chose. There are a few, albeit minor, differences in the degree of personalization between them.

We compiled a selection of search terms which we identified as relevant in the context of Germany’s federal election, to see what YouTube would show users who were logged into their account (or not). To do so, we looked at Twitter trends, Google trends, current news, and the parties’ election programs and discussed them within the project. The result was seven terms for our search experiment: Scholz, Baerbock, Laschet, Afghanistan, Gendern (i.e. using gender-inclusive language), Hochwasser (i.e. flood), Impfpflicht (i.e. mandatory vaccination), and “Bundestagswahl 2021 wen wählen” (i.e. federal election 2021, who to vote for). “Hochwasser” was only added to the list shortly after the launch of the data donation campaign; after all, the tragic flooding in North Rhine-Westphalia was the dominant news topic at the time of our platform’s launch. Previously, we had agreed in the consortium on another campaign term with an environmental reference: “Benzinpreis” (i.e. gas price). Replacing “Benzinpreis” with “Hochwasser” was based on their discursive relationship across issues of “environment” and “climate change.” We chose some of the terms, such as “Impfpflicht” and “Gendern,” because we expected a polarizing effect that would help to investigate radicalization tendencies on YouTube without having to use explicitly problematic terms ourselves.

However, the results of the personalized (logged in) and anonymous (logged out) search queries hardly differ, as can be seen in the following table:

Percentage of concordance (logged in/logged out) of search results by query.

The table shows the similarities for the first 20 search results. The order of videos was not taken into account. The search queries whose median shows a similarity of 100% are particularly interesting, as this means that at least half of the data donations were the same in terms of content, regardless of whether a user was logged in or not. That this is especially true for terms relevant to election politics suggests appropriate curation and caution in recommender design on YouTube’s part. In our data, the search results are not based on any personalized curation. A possible reason for this observation we have considered is that little or not at all personalized search results are easier to moderate.

We can further speculate that the discrepancy between median and arithmetic mean (ratio) points to a personalization threshold. That is, a few donors receive personalized search results, while for most the search results differ little or not at all. However, one needs to keep in mind that a small fluctuation in the query results is to be expected: our pretests found that small and lower-ranked parts vary for one account if the request is not repeated immediately. One possible explanation for this phenomenon would be that results are only personalized above a certain quantitative threshold of investment in a topic. However, randomness would be an explanation, too.
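To make the comparison of logged-in and logged-out results concrete, here is a minimal sketch of an order-insensitive concordance over the first results, together with the median and mean across donors. The exact measure behind the table is not spelled out here, so the overlap ratio used below is an assumption; the function name `concordance` and the toy result lists are hypothetical.

```python
from statistics import mean, median


def concordance(logged_in, logged_out, k=20):
    """Share of overlapping videos among the first k results, ignoring order.

    Assumption: overlap is divided by the size of the larger result set.
    """
    a, b = set(logged_in[:k]), set(logged_out[:k])
    if not a and not b:
        return 1.0
    return len(a & b) / max(len(a), len(b))


# Toy data: (logged-in results, logged-out results) for three donors.
donations = [
    (["a", "b", "c", "d"], ["d", "c", "b", "a"]),  # same videos, other order
    (["a", "b", "c", "d"], ["a", "b", "c", "d"]),  # identical
    (["a", "b", "c", "d"], ["a", "b", "x", "y"]),  # half personalized
]
scores = [concordance(li, lo) for li, lo in donations]
print(median(scores), mean(scores))
```

In this toy case the median stays at 100% while the mean drops below it, which is the kind of gap that a few strongly personalized donors would produce.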

Let us now take a look at the data distribution. For example, across the 1980 complete data donations, the personalized search query for “Scholz” showed only 213 different videos in the first 20 results to our logged-in donors (compared to 206 without log-in). Counting the different channels, we find only 71 different channels logged in and 70 without login. There are only four channels that differ, so together they also only account for 74 channels. If we combine the individual different video items produced by the personalized and anonymous search, the number of videos also increases only slightly, to a total of 221 (from 213 and 206). It is hardly possible to speak of personalization, even though YouTube explicitly points out in the privacy settings that deactivating the watch history would impair the performance of the search function. This claim cannot be confirmed with our data.
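The video counts above also imply how many videos the two conditions share, via inclusion-exclusion. The short sketch below only rearranges the three reported numbers; the helper name `shared_items` is hypothetical.

```python
def shared_items(n_a: int, n_b: int, n_union: int) -> int:
    """Inclusion-exclusion: |A ∩ B| = |A| + |B| - |A ∪ B|."""
    return n_a + n_b - n_union


# Reported counts for the "Scholz" query (distinct videos in the top 20).
logged_in, logged_out, combined = 213, 206, 221

shared = shared_items(logged_in, logged_out, combined)
only_logged_in = logged_in - shared
only_logged_out = logged_out - shared
print(shared, only_logged_in, only_logged_out)
```

So, on these numbers, the large majority of distinct videos appears in both the personalized and the anonymous condition.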

Comparing the search results with the news feed in terms of personalization, the average match between logged-in and anonymous recommendations for YouTube’s headlines is only 15%. In this case, the median is similar and thus seems to represent — at least for our sample — a kind of general personalization level. The relationship of topic loyalty between search results and news feed is also interesting to look at. The search function does its job with a strong thematic match of search term and result. The search for “mandatory vaccination” videos returned only 116 non-topically related hits, which we identified with the help of our self-defined term set around “Corona”. The 116 hits were composed of only three different videos. The most frequently occurring is a “WELT Nachrichtensender” channel video on the flooding in NRW. On the other hand, things look different when it comes to further recommendations on the newsfeed. Although truer to the topic without logging in, news headlines on Covid-19 rarely refer to other videos that also relate to Corona. That is, based on 75,000 Corona-related videos from the News Feed, only 13,000 recommendations explicitly refer to Covid-19.

The analysis of the news feed and the search queries also shows a strong dominance of the WELT channel. Our project partners from AlgorithmWatch had already investigated the YouTube news headline bias of the Springer medium “WELT” (read here [German]). WELT also performs astoundingly ‘well’ in the search query experiment, although it does not lead on all frequency lists. For example, “WELT Nachrichtensender” appears 18198 times in the search results for “Impfpflicht” (i.e. mandatory vaccination, without log-in), while the second-placed channel “tagesschau” appears only 5467 times. “WELT Nachrichtensender” also leads by far for “Hochwasser”, “Benzinpreis” and “Baerbock”. Only for the search terms “Laschet” and “Scholz” does the channel not deliver the most results.

In the news and the recommendations subsequent to the news, the public service offerings on YouTube can keep up with Springer and other private media in total. In the news and headlines and the associated recommendations, they clearly dominate collectively. YouTube-intrinsic channels, which we define as YouTubers without affiliation to existing media institutions outside the platform, play a subordinate role in this kind of news content, although they populate large parts of the recommendations and shape the image of our auto-play experiment. However, this ratio is also an artifact of the FUNK group of public broadcasters, as a large number, a whopping 55, of the young and more journalistically active YouTube channels belong to FUNK. It gives the impression that YouTube, as the company’s spokespersons claim, is curated editorially, but with little deviation from institutionalized notions of a media mainstream. Whether the masses of videos from “WELT Nachrichtensender” are due to their search engine optimization quality or also to the monetization of content from private media companies or even something else remains unclear. It is certainly noticeable that this channel produces a lot of content output and is using hashtags in its video descriptions. However, one may also note that “WELT Nachrichtensender” uses terms that are not used by public broadcasters. Thus, the channel leads several term-frequency-lists: For example, “WELT Nachrichtensender” is the channel with the most videos in the newsfeed that has the term “Corona” in its title, and it also ranks first for “Covid.” “Covid,” for example, is not used in the newsfeed of public service titles.

In terms of content, the search results are not very spectacular. It takes a thorough search to find relevantly suspicious content from the right-wing spectrum. Our experiments do not seem to have uncovered any explicit conspiracy theory. But we did find formats that flirt with it in one way or another: These are, in addition to appropriately ‘controversial’ providers such as “OE.24”, “TV.BERLIN”, “BILD” also content from the AfD. Business coaching channels with links to the “libertarian” alt-right can also be found, but also those with innocuous explicit content that make use of clearly right-wing jargon and framing for implicating titles, etc. There you can find, for example, bizarre formations of YouTubers who promote videos with “CANCEL CULTURE”, supposedly put political correctness up for debate, but use gender-sensitive language even in their video and channel texts. Also, some of these channels use anti-etatist jargon while appearing moderate to conservative in content. If one visits the comments on such videos, many do not seem to be bothered by the mere superficiality of such stylistic devices, just as the channel operators do not seem to be bothered by the openly state-skeptical, right-wing drifting comments. However, these are found in the outer ranges, between the countless individual recommendations. For example, after “Tagesschau” news, without logging in, there are still 18 recommendations for the channel “RT Deutsch”, all but two of which openly fuel Corona conspiracies. With log-in, there are still two videos, both of which contain so-called “Corona-skeptic” content and one of which is close to the AfD. Explanatory anomalies in the history could not be found in the affected donation profiles.

YouTube’s “answer” to the question of who to vote for in the German federal elections consists primarily of party-political overview videos and educational material on Germany’s political system or election research. Among the items shown to almost all data donors (over 1950 of 1980) are titles such as “WAS SOLL ICH WÄHLEN? — REALTALK über Politik & Parteien!” (i.e. “what should I vote for? realtalk on politics and parties!”), “Bundestagswahl 2021: aktueller Bundestrend (Linke | SPD | Grüne | FDP | CDU/CSU | AfD)” (i.e. “federal election 2021: current trends [list of big German parties]”), or “Einfach erklärt: Wie funktioniert die Bundestagswahl?” (i.e. “explained simply: how does the federal election work?”). Only further down in the results are more droll surprises to be found, such as “Der große Cannabis-Check der Parteien” (i.e. “the big cannabis check of the political parties”).

For the auto-play experiment, we selected start videos (seeds) that had political content or references and were promising in terms of pathing and branching. The following analyses consider the videos across all recommendation paths. On the one hand, the path origin itself is relevant for our research. On the other hand, part of the strategy of the experiment is to obtain, at least with some probability, smaller “seed” experiments, i.e., video accumulations across seed videos and donors that are instructive for analysis. Our research interest here had two foci: First, which media experiences (in terms of semantic order) can be qualitatively identified and quantitatively assessed? Second, which forms of recommendation paths can be found: are there roundabouts, highways, or the like?

The first question refers to the narratives and interconnections that the recommender draws, assembles, and reproduces. The second question about the paths of the recommendations conceptually interrogates the data not so much in terms of content but in terms of form. Here we are concerned with the phenotypes of the YouTube ecosystem. Let us first look at the paths of recommendations before we also discuss the content dimension of recommendations.

Does YouTube’s recommendation system show us rather different videos within each auto-play path, or does it repeat videos (perhaps even the seed video), even within a single path? The latter was more often the case than the scenario of all videos in a path being different. This could be an artifact of our method of only briefly visiting videos rather than playing them all the way through; however, the watch history recording was disabled during the experiment. Another reason for disabling it was to keep the donors’ histories clean of our experiments. Yet one needs to note that YouTube’s recommender barely shows videos that have been viewed from 90 to 100 % and mostly covers completely unwatched content, at least for our data. Between those poles there is a distinct, flat, and edgy U-curve, as YouTube seems to regard half-seen content as rather uninteresting for its viewers.

Relative distribution of “roundabouts” per seed: did an auto-play chain contain at least one video at least twice?
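The “roundabout” criterion in the caption above (at least one video appearing at least twice in a chain) is easy to state in code. This is a minimal sketch on toy paths; the names `has_roundabout` and `roundabout_share`, and the data layout of one list of paths per seed, are assumptions for illustration.

```python
def has_roundabout(path):
    """True if at least one video occurs at least twice in the path."""
    return len(set(path)) < len(path)


def roundabout_share(paths_by_seed):
    """Per seed, the fraction of auto-play paths that contain a roundabout."""
    return {
        seed: sum(has_roundabout(p) for p in paths) / len(paths)
        for seed, paths in paths_by_seed.items()
        if paths
    }


# Toy data: two seeds with recorded auto-play paths.
paths_by_seed = {
    "seed_a": [["s", "x", "y", "x"], ["s", "x", "y", "z"]],
    "seed_b": [["t", "u", "v", "w"]],
}
shares = roundabout_share(paths_by_seed)
print(shares)
```

Whether the seed itself counts as part of the chain is a design choice; here it does, so a path that returns to its seed is also flagged.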

Most of those ‘roundabouts’ (wherein videos appeared several times) were found in the recommendation paths of Rezo’s “Die Zerstörung der CDU” (i.e. the destruction of the CDU) video, a really striking item that caught a lot of media and public attention when it was uploaded in 2019. This one not only contains the most redundancy, but also the most extreme cases, like full loops (instead of back-and-forth patterns) or repeating items with up to four repetitions in one auto-play path. This was somewhat surprising but could be due to the age of the video; at least this would be a clear unique characteristic in comparison to the other seeds.

Example of a “path-map” containing roundabouts (here between the videos “Wie und ob ApoRed denkt”, “Eine ausführliche Analyse von ‘Bruder vor Luder’” and “Eine ausführliche Analyse von ‘Der Lukas Rieger Code’” within one data donation). The different colours represent different seed paths; each arrow points from a source video to the video being auto-played.

We assumed that a video with such mass attention in the past and over time would be highly resolved in the platform’s inter-item and contain many subtle differences in recommendations. However, the picture is different. Video dispersion, the ratio of the number of different follow-up videos versus the greatest possible variation, was significantly lower for the Rezo video versus a recent and less-clicked seed video from Rezo’s secondary channel “Renzo” concerning a possible online TV-duel between Germany’s chancellor candidates.
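The dispersion ratio described above could be computed roughly as follows. This is a sketch under a loud assumption: the “greatest possible variation” is taken to be the total number of follow-up slots across all paths of a seed (every slot showing a different video), which may differ from the definition actually used. The function name `video_dispersion` and the toy paths are hypothetical.

```python
def video_dispersion(paths):
    """Distinct follow-up videos divided by the number of follow-up slots.

    Assumption: the maximum possible variation equals the number of
    follow-up positions, i.e. every position showing a different video.
    """
    follow_ups = [video for path in paths for video in path[1:]]
    if not follow_ups:
        return 0.0
    return len(set(follow_ups)) / len(follow_ups)


# Toy data: an older seed with repetitive paths, a newer one with varied paths.
paths_old_seed = [["old", "a", "b"], ["old", "a", "b"], ["old", "a", "c"]]
paths_new_seed = [["new", "d", "e"], ["new", "f", "g"], ["new", "h", "d"]]

print(video_dispersion(paths_old_seed), video_dispersion(paths_new_seed))
```

A value near 1 means almost every recommendation slot held a different video; a value near 0 means the same few videos kept recurring.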

We were also able to find a type of “highway access” or “funnels” in the data.

Exemplary illustration of a “funnel” video P with numerous sources but only a few top/auto-played recommendations.

This means that in the auto-play paths a video is preceded by many videos (i.e. input degree) but followed by only a few videos (i.e. output degree). Such funnels seem to occur specifically, and sometimes to an extreme extent (e.g. with an input degree of 38 versus an output degree of 1), when formats explicitly (by concept) and implicitly (by consumption standards) belong together. These are, for example, multi-part documentaries, podcast series, or other sequential videos. These then unroll over their individual parts and then sometimes run back into non-funnel formats. In almost all cases, there is high channel fidelity. That is, the channel of a video is often also the channel of recommendation.
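Input and output degrees of this kind can be read off the recorded paths by treating each consecutive pair of videos as a directed edge. The following is a minimal sketch on toy data; the names `degrees` and `funnels` and the thresholds are hypothetical, and degrees are counted over distinct neighbouring videos, which is an assumption about how they were measured.

```python
from collections import defaultdict


def degrees(paths):
    """Return {video: (input degree, output degree)} over distinct neighbours."""
    sources = defaultdict(set)  # video -> videos that led to it
    targets = defaultdict(set)  # video -> videos that followed it
    for path in paths:
        for src, dst in zip(path, path[1:]):
            targets[src].add(dst)
            sources[dst].add(src)
    videos = set(sources) | set(targets)
    return {v: (len(sources.get(v, ())), len(targets.get(v, ()))) for v in videos}


def funnels(paths, min_in=3, max_out=1):
    """Videos reached from many sources but followed by only a few videos."""
    return sorted(
        v for v, (d_in, d_out) in degrees(paths).items()
        if d_in >= min_in and 1 <= d_out <= max_out
    )


# Toy data: three different sources all lead into video "p".
paths = [["a", "p", "q"], ["b", "p", "q"], ["c", "p", "q"]]
print(degrees(paths)["p"], funnels(paths))
```

Swapping the comparison (few sources, many follow-ups) would flag the opposite pattern instead.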

In addition, “forking”-roads can also be found. For example, a Renzo video in our data has an input degree of 8 and an output degree of 56. With a frequency of 345 in the overall experiment, it occurs very often but is only accessed via eight previous sources. This phenomenon occurs with channels that form a kind of cluster around our seed videos. This cluster consists of the seed channels themselves, directly associated channels, and other — depending on the perspective — quite related channels such as “Kanzlei WBS”, public broadcaster channels, SpaceFrogs, and Ultralativ (as was also to be expected with our sampling bias). Particularly noticeable — and part of the seed — are MrWissen2go’s channels. In addition to the pseudo-bifurcations of our seed videos, which top the table in their absolute frequency and have correspondingly high output degrees, there are not only countless other MrWissen2go videos in our data but also one (namely “The Dirty Business of Ghostwriting”) whose output degree (in addition to high input degree) can definitely compete with the seed videos. Moreover, the video is accessible from over half of the seeds. This video acts as a kind of highway in our sample. Ultralativ’s channel is equally distinct. This channel represents a series of on-ramps — users are led from various sources (but also including themselves) to fewer references — and again, mainly the source channels’ content — which is a regular case throughout our data.

However, it is again worth paying attention to rare events and peculiarities. There are many points of contact between the recommendation paths of maiLab’s “7 critical questions about vaccination” and the video of CGArvay (to the person) about long-term consequences of mRNA vaccination (the video is no longer available). Sometimes one also finds esoteric and other conspiracy-oriented content related to both. This rarely occurs, and where it does, it is thematically mediated and counterfactually connected via the opposition: such paths begin with the CGArvay seed and find their way back via RKI content to videos whose content is not based on institutionalized knowledge. In the process, one also encounters the figure CGArvay again. CGArvay’s channel does not appear anywhere other than in the set seed in the whole auto-play data, in plain contrast to all the other seed channels that do reoccur. This is a unique feature of this seed video. Overall, the impression is that the respective source video is more determinative than the history of the various users.

Structural overview of the auto-play experiment’s result. The mapping is optimizing for closeness, thus the outer bubbles can be understood as source specific niches with the center containing the most common and shared recommendations. The colours code the most frequent channels. Yellow: ArteDE, Green: Spacefrogs, Light Blue: NDR Doku, Brown: PietSmiet, Orange: MrWissen2go, Pink: FlipFloid.

However, this may be an effect of our highly self-selective sample. Yet, there are counter-examples at hand. Looking closely at the more redundant auto-play extremes, some seed paths merge and get stuck in a specific topic, genre, or channel, in strict contrast to the common cases of video specificity. Therefore, we continue to work on adding metadata to expand the analysis capabilities and to look at the organic data in comparison to the synthetic data from the pre-conceptual phase and the pretests. There are many phenomenal figures to discover, describe, and catalogue. However, whether we observe coincidence at work (e.g. with the countless single video items in the data) or great specificity of recommendations, we cannot tell from looking at the data alone.

This concludes the excursion through our analysis. The bottom line is that for all the observations and interpretations, the main impression remains: this type of platform research offers much to observe and describe, and yet the analysis, not only in our case, remains a circumstantial process. There are, of course, leads to follow, finer, more complex procedures, and follow-up surveys. Ultimately, the explanations themselves remain purely speculative. Are the content-related tightrope walks between coquetry, jargon, uncertainty, and disruption the result of direct interactions between user and item, or does this also include more abstract content and semantic models? What are the criteria for curation? After all, YouTube is a commercially run platform, but given the key infrastructural position it holds, it must balance the interests of users against its own, and also take social responsibility not only for content but also for the distribution and connection of content. Especially looking at the role of public broadcasters on the platform (and the platform’s role for public broadcasters), creators and a fortiori media services need to be able to actively participate in controlling, and thus take responsibility for, the relational position of their videos in YouTube’s net of content. Research like ours is also a reminder of the difficulty of such a task, even for the platform operator itself. Without waving too much at the complexity of society: certainly, there is no need for power fantasies. And as the Ultralativ video, to whose feature we owe many of our data donations, rightly points out: unconditional transparency is (in this case) no solution either. But transparency vis-à-vis independent observers could help to render and keep such vast and volatile platforms, which are actively and involuntarily in flux, inclusive and fair, and to enhance media responsibility and accountability.
Finally, despite this line of sight and the manifold limitations and problems that data donation-based research can pose, there is certainly a need for more data donations, not less: For more research opportunities, for more independent consideration of the unpredictability that cannot be represented even in transparent models. Only with the authentic merging of data and experience can socio-technical ecologies be captured in their historical specificity and emergent reality.

Sociologist/Researcher @ European Newschool of Digital Studies (European University Viadrina). Currently working on DataSkop, funded by BMBF.