Authors: Bogdan Velişco, Fatema Hassan, Martin Sterum Greibe, Sofie Lundstrøm Weywadt, Sune Larsen
The debates surrounding the topic of genetic modifications in agriculture and food production are widely spread across the internet. The use of genetic modification methods in the production of food is something a lot of people have opinions about since it affects everyone. With that in mind, it is important to investigate this field, the controversies within it and the ways it is articulated by experts as well as the general public.
In order to do this, we wish to promote an understanding of the happenings within this topic, focusing on the way it is being expressed and presented online, specifically on Wikipedia and Reddit. We will do this by using data from Wikipedia to represent mostly the expert view, while we will use data from Reddit to represent the discussion from a predominantly layman point of view (Whatmore 2009).
Starting with Wikipedia, we will now elaborate on the story of the Pusztai Affair. The Pusztai Affair is a controversy that began in 1998, after the scientist Árpád Pusztai went public with the preliminary results of his as yet unpublished research. He was investigating the effects of genetically modified potatoes on rats and the possible consequences they might have if consumed by humans. He claimed that the genetically modified potatoes stunted the rats’ growth, repressed their immune systems and thickened their mucosa. The research was being conducted at the Rowett Institute in Scotland. Initially, the institute supported Pusztai’s claims, but after his comments on the TV programme “World in Action” started a controversy, the institute stopped supporting him. Pusztai was suspended and his data was seized.
The Pusztai Affair can be used to exemplify one of the many controversies that surround the topic of genetic modifications in food production. In terms of safety, dangers and the future of the technologies within the field, there is a myriad of arguments to be brought up — both for and against — and especially in digital spheres such as Reddit, these discussions have blossomed. The significance of pinpointing the Pusztai affair is to present a case that can help promote an understanding of GMOs in a broader perspective.
It’s important to acknowledge these discussions, and in the following, we will make an effort to understand why that is, as we move along. By analyzing the data available to us, we wish to make these debates more tangible, and through visual maps and graphs assess the articulations that flourish in this battlefield of opinions (Venturini et al. — Visual Network Analysis).
To harvest data from Wikipedia, we scraped the pages in the category ‘Genetically Modified Organisms in Agriculture’ and its subcategories. This provided a list of 75 Wikipedia pages. We then built a network of Wikipedia pages that refer to each other through in-text links. To collect data for this, we scraped the 75 pages for links and sent a crawler to follow these links and scrape each linked page for its links in turn. We asked the crawler to do this to a depth of two levels.
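The two-level crawl described above can be sketched roughly as follows. This is a minimal illustration and not the script we actually ran: `get_links` is a hypothetical stand-in for whatever function fetches a page and returns the titles it links to, and the toy link table exists only to make the example runnable.

```python
from collections import deque

def crawl_links(seed_pages, get_links, max_depth=2):
    """Breadth-first crawl: follow in-text links up to max_depth levels from the seeds."""
    edges = set()                               # (source page, linked page) pairs
    depth = {page: 0 for page in seed_pages}    # level at which each page was first seen
    queue = deque(seed_pages)
    while queue:
        page = queue.popleft()
        if depth[page] >= max_depth:
            continue  # pages at the last level are recorded but not scraped further
        for target in get_links(page):
            edges.add((page, target))
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return edges, depth

# Toy link table (hypothetical) standing in for real scraped pages.
TOY_LINKS = {
    "Pusztai affair": ["Genetically modified potato", "Rowett Institute"],
    "Genetically modified potato": ["Genetically modified food"],
    "Genetically modified food": ["Monsanto"],
}

edges, depth = crawl_links(["Pusztai affair"], lambda page: TOY_LINKS.get(page, []))
for source, target in sorted(edges):
    print(source, "->", target)
```

The resulting edge list is the kind of table a network tool can read as a directed graph. Stopping at two levels keeps the network from growing far beyond the original category.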
After this, we applied a keyword search to the network, collecting the data by scraping the 75 pages for our chosen keywords. We also made a revision timeline of the Wikipedia page ‘Pusztai affair’, for which we scraped the page’s revision history.
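Turning a scraped revision history into a timeline amounts to counting revisions per year. The sketch below assumes the timestamps arrive as ISO 8601 strings (the format the MediaWiki API uses); the sample values are made up for illustration and are not our actual revision data.

```python
from collections import Counter

def revisions_per_year(timestamps):
    """Count revisions per calendar year from ISO 8601 timestamp strings."""
    # The first four characters of an ISO 8601 timestamp are the year.
    return Counter(ts[:4] for ts in timestamps)

# Illustrative timestamps only, not actual revision data.
sample = ["2012-09-20T14:03:11Z", "2012-10-01T08:15:00Z", "2013-01-05T19:42:30Z"]
counts = revisions_per_year(sample)
for year in sorted(counts):
    print(year, counts[year])
```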
Lastly, we made a map of a semantic analysis. To collect data for this we scraped the 75 pages for all the text.
We chose six relevant subreddits, r/GMO, r/GMOinfo, r/GMOfacts, r/GMOmyths, r/Monsanto and r/GMOnews, to which we applied a provided script.
We created a Reddit scraper using Reddit’s API. The crawler was named “GMO scraper on Reddit”. Due to the structure of Reddit, it is able to identify fields such as type (submission or comment), date, author, subreddit, parent ID and others, and to write them into a spreadsheet.
We then ran the Reddit script calling the API with the value of 500 hits based on “top” submissions and comments.
Because the subreddits differ in size, the number of hits varied from one subreddit to the next. In total, this resulted in 10244 hits spread across the 6 subreddits.
The timeline above shows a spike in the number of revisions in 2012, followed by a decrease in interest over the years until the present time. The spike might have been provoked by a similar event, the Séralini affair, which happened in 2012 and might have generated more interest in the topic. The revisions are mostly grammar changes or slight fixes, such as adding years and correcting citations, but they still signal an increased interest in the article and, most likely, a higher number of people accessing it. The decrease in revisions over the years might be because there were fewer and fewer issues to fix after the 2012 spike in edits and improvements to the text. Another reason might be a natural decrease in interest, as the novelty and relevance of an event can fade with time.
The timeline above shows 2015 as the year with the most revisions. As far as we can see from the edit section of the Wikipedia page, it is also the year in which the “Genetically modified potato” page was created. That in itself might be the reason for the high number of revisions, as it was a new article in development. The timeline, however, shows the creation of the article in 2014. It might be that the term itself was added that year, and the information regarding it was added later on. We can also assume that some of the interest in the topic in 2015 appeared because of the ban South Africa put on genetically modified potatoes, but we cannot know for sure.
To the left of the map above we can see a small brown cluster centred around the page Genetically modified maize. Surrounding it we see, for instance, the Séralini Affair, Gilles-Éric Séralini, and Corngate pages. Given that there are edges between this cluster and the pink cluster (Genetically modified food controversies), it could be deduced that there are more episodes of GMO food controversies than the Pusztai affair. The pink cluster centres around Genetically modified food and has edges connecting it to every other cluster in the network. Both the size of this node and the number of edges it has suggest that this is a field plagued with controversy, given the nature of the pages in the network, which range from Center for Food Safety, linked to GMO Answers, to List of genetically modified crops.
Below are 6 visualisations showing this same map, but where we have highlighted the frequency of the use of our selected keywords:
Semantic analysis (Wikipedia)
In the following, we will present a semantic analysis of the GMO controversy based on text scrapes from 75 Wikipedia pages.
If we make a full-text scrape from the Wikipedia category and run it through a word cloud generator we will see which words are mentioned the most throughout all the Wikipedia pages.
Not surprisingly, food, genetically and modified stand out. We also see that the company Monsanto is one of the words mentioned the most times throughout the Wikipedia pages. This tells us that Monsanto plays a huge part in the GMO field.
However, it’s important to acknowledge the drawbacks of using a word cloud generator to understand the general terms that circulate within the topic. As can be seen in the cloud above, words such as most, because, and using also make an appearance, although they say nothing about the topic itself.
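One common way to reduce this drawback is to filter out such filler words before counting. The following is a minimal sketch of that idea, not the word cloud generator we used: the stop-word list is a small hand-picked assumption, and the sample sentence is invented for the example.

```python
import re
from collections import Counter

# Small hand-picked stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"most", "because", "using", "the", "of", "and", "is", "in", "a", "are"}

def top_words(text, n=3):
    """Return the n most frequent words after dropping stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS).most_common(n)

# Invented sample sentence, not text from the scraped pages.
sample = "Most genetically modified food is food because of modified crops using modified genes."
print(top_words(sample, 2))
```

With the filler words removed, only the topical words compete for the top positions, which is what a word cloud is meant to show.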
It can be beneficial to use the word cloud generator as a tool to identify themes to create an overview prior to using another tool to dig deeper into these themes. In this instance, we have used Cortext, which is a tool that looks for nouns that appear together often in texts. Thereby it can help us uncover a deeper understanding of the semantics within a given Wikipedia category or on Subreddits.
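The basic idea of counting which terms appear together can be illustrated with a few lines of Python. This is only a rough sketch of co-occurrence counting and should not be read as Cortext’s actual algorithm: the list of terms is supplied by hand rather than extracted as nouns from the text, and the sample documents are invented.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(documents, terms):
    """Count how often each pair of terms appears in the same document."""
    pairs = Counter()
    for doc in documents:
        text = doc.lower()
        # Sort so that each pair is always stored in the same order.
        present = sorted(t for t in terms if t in text)
        pairs.update(combinations(present, 2))
    return pairs

# Invented sample documents, not text from the scraped pages.
docs = [
    "Roundup is a herbicide sold by Monsanto.",
    "Monsanto developed maize tolerant to the herbicide.",
    "Genetically modified maize is a common crop.",
]
pairs = cooccurrence(docs, ["herbicide", "maize", "monsanto", "roundup"])
for pair, count in pairs.most_common():
    print(pair, count)
```

Pairs with high counts become strongly connected nodes in a map, and groups of such nodes form the clusters that we read as discourses.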
If we run the text scrape from the Wikipedia category through Cortext, we see that 6 clusters appear. Three of those are very dense, which tells us that there are three major discourses within the category of GMO in agriculture: one about crops connected to genetically modified maize, another about genetically modified food, and a third about human health. We also see three minor discourses. The first consists of different words connected to Roundup, the second is about proteins and genes closely connected to butterflies and moths, and the third is about modified foods.
Dynamic map (Reddit)
Above we have created a visualisation that illustrates the progression of activity over time in data gathered from six GM-related Subreddits.
The network is a bipartite one, meaning that there are two types of nodes with two different attributes. The nodes in this network represent commentators and the submission IDs of the submissions on which they have commented. They are connected through edges, which in this instance represent activity through commentary. The timeline spans from 2012 to 2019, and we can use this information to identify, and possibly correlate, instances and happenings on the topic of GMO with the activity in these Subreddits. For example, in late 2015, when Hillary Clinton ran for the presidency in the United States, there was a post made in the GMOInfo Subreddit where her alleged support for GMOs by the Gates Foundation was discussed, which would probably lead to a spike in comment activity considering the worldwide interest in US politics. Real-life events like these, which lead to discussions, can therefore help us understand the contexts within which the discussions appear.
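A bipartite network of this kind can be built from nothing more than a list of (commentator, submission ID) pairs. The sketch below is a simplified illustration using plain Python dictionaries rather than the network tool we used, and the user names and IDs are hypothetical.

```python
from collections import defaultdict

def build_bipartite(comments):
    """Map each commentator to the submissions they commented on, and vice versa."""
    by_author = defaultdict(set)
    by_submission = defaultdict(set)
    for author, submission_id in comments:
        by_author[author].add(submission_id)
        by_submission[submission_id].add(author)
    return by_author, by_submission

# Hypothetical (author, submission ID) pairs, not taken from our dataset.
comments = [("user_a", "s1"), ("user_a", "s2"), ("user_b", "s1"), ("user_a", "s1")]
by_author, by_submission = build_bipartite(comments)

# Degree of an author node = number of distinct submissions commented on.
degrees = {author: len(subs) for author, subs in by_author.items()}
print(degrees)
```

Because sets are used, repeated comments by the same user on the same submission count as a single edge. Commentators with a high degree are the ones that stand out as particularly active in the map.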
Beneath is a static version of the above-mentioned dynamic map.
Here we have an annotated version of the same map, where we have highlighted some of the more active users. We have classified them as active due to the relatively high number of edges connected to them.
In the purple cluster we have identified users and submissions from the Monsanto Subreddit. Their placement in the network tells us that this particular group of users is not as interrelated with our other Subreddits as the users seen to the right of the map, where users and submissions cluster together more tightly with other Subreddits. This could mean that the topics in the Monsanto subreddit are perhaps not as centred around the topic of GMO as we had expected. Furthermore, we could also deduce that the commentators within this subreddit are not the same commentators as in our other chosen Subreddits, due to the number of edges going out from the purple cluster.
In the clusters to the right, however, there are evidently more interconnections between the Subreddits and the users engaged. What is also evident is the amount of activity that specific users have in the network. These particular users could be both submitting posts and commenting on posts, which could prove beneficial to the community but could also be a drawback. In terms of keeping the Subreddit and the community alive, it can be seen as a positive attribute. On the other hand, this very activity, if not complemented by other inputs, can steer the direction of the discourse to suit the narratives of a select few individuals, which then results in a heavily biased discussion akin to an echo chamber.
GMO: The major spikes happened in 2015 and 2018. The 2015 spike is most likely generated by the lawsuit against the US government for withholding GMO related info or the EU law change regarding the possibility for the Member States to restrict or prohibit the cultivation of
GMO Myths: The spike happened in 2015–2016. One of the reasons, again, might be the aforementioned change in EU law. The slight delay in the spike towards the end of 2016 can be explained by the activity of the users trying to debunk or revive myths in order to determine which GM products should be banned in their country and which should not.
GMO Facts: A spike in 2013 and a small increase in 2018. In 2013 the mass GMO protests occurred, which brought GMO into the discourse of the general public. In 2018, the event that might have triggered the spike was the U.S.D.A. announcing a G.M.O. labelling standard.
GMOInfo: The spikes happened in 2013, 2015 and 2018. This was probably provoked by the same significant GMO related events that were described in the explanations for the previous subreddits.
GMONews: There is a small but steady rise in activity from 2013 to 2016. It might be because news is steadily in the interest of people. It can also be argued that people who are interested in the news are more likely to hold a steady interest in various topics.
We can observe three main spikes in the number of submissions and comments over time on the six different Subreddits showcased in the timeline. The first spike, specifically for the pages GMO, GMO Facts and Monsanto, happened in 2013. This can be attributed to the mass protests against GMO products and Monsanto in the same year. The difference in the number of comments and posts between the subreddits might come from a difference in the number of users, as GMOInfo and GMONews have two or three times fewer subscribers than the other subreddits. GMOMyths and Monsanto have almost the same number of subscribers as the top Subreddits, but fewer submissions and comments; this might be a result of less interest or relevance at the time. The second spike comes in 2015, and this is likely the result of a modification of the EU legislation regarding GMO in 2015, which allowed the member states to make their own decisions about GMO products. GMOInfo and GMOMyths were then more in line with the top subreddits, while GMONews and Monsanto still remained less popular. The spike in 2018 is the smallest of the three, and even before it we can see a steady decline in the number of submissions and comments. The spike in 2018 might have been an effect of Golden Rice being approved by the FDA in the US, as seen on Reddit.
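A timeline like this comes down to counting submissions and comments per subreddit per year. The sketch below shows one way to do that. It assumes that each scraped row carries the subreddit name and a creation time as a Unix timestamp in UTC, which is how Reddit’s API reports it; the rows themselves are invented examples and not our data.

```python
from collections import Counter
from datetime import datetime, timezone

def activity_per_year(rows):
    """Count rows per (subreddit, year); each row is (subreddit, created_utc)."""
    counts = Counter()
    for subreddit, created_utc in rows:
        year = datetime.fromtimestamp(created_utc, tz=timezone.utc).year
        counts[(subreddit, year)] += 1
    return counts

# Invented example rows: (subreddit, Unix timestamp in seconds).
rows = [("GMO", 1356998400), ("GMO", 1357000000), ("Monsanto", 1420070400)]
counts = activity_per_year(rows)
for (subreddit, year), n in sorted(counts.items()):
    print(subreddit, year, n)
```

Summing the counts over all subreddits for each year gives the accumulated activity, while keeping them separate gives one line per subreddit.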
Upvotes over time
We have chosen to look at upvotes separately due to the nature of Reddit. On Reddit, the activity in posts doesn’t necessarily correspond to the activity in upvotes.
As we expected, the graph shows that the number of upvotes followed the number of posts and submissions on the Subreddits. There is a spike in 2013, just like in the former graph. The spikes in 2015 and 2018 most likely have the same reasons as the ones stated for the other graph, meaning big events like a change in legislation, be it in the EU or the US.
The number of upvotes declines at a slower pace, which might be because there are periods with events that generate more agreement and less controversy, and thus fewer comments and more upvotes. It can also be due to the nature of upvotes, which are easier to give than a written comment, so people may keep upvoting out of habit for longer periods of time. It also appears that the Monsanto Subreddit generates more activity through upvotes, which can mean that people are more likely to agree with posts on this page than to comment on them, maybe because information about Monsanto is not something that can easily be gathered and used to form an opinion.
Cortext map for Reddit
The Reddit map has GMO food as its main cluster, with some clusters surrounding it containing keywords such as ‘negative consequence’, ‘health consequence’, ‘deficiency’ and ‘defect’, in short, focused on what we can call ‘risks’. This suggests that Reddit’s discourse is centred around problems and issues (as compared to Wikipedia, where it’s centred around safety and solutions). Several clusters on the right form a science/health group and showcase the scientific discourse. It is interesting to notice that we have two anti-GMO clusters, one in the top left and one in the lower right. One of them is directly connected to scientific clusters, while the other one is separated from all other clusters and distanced from ‘science’. We can conclude that some ‘anti-GMO’ groups separate themselves from the scientific discourse, avoiding scientific terminology and sources, while others use scientific terms and are closer to science per se. The ‘anti-GMO’ clusters are the only ones with space between them and the other clusters, and this can show an unwillingness to join the main discourse or just a natural separation, as it is an anti-movement.
Comparing it to the Wikipedia map, we can see that in the Reddit one the clusters are much closer to each other and far more interconnected. This can be seen as inherent to the nature of the platforms, one being a social one and the other being more connected to science and facts. The fact that the users debate different topics brings them closer to each other’s points. The Wikipedia map has ‘Safety’ as a separate cluster, which we do not encounter as a topic in the Reddit map. This might show that the Wiki community is more concerned with safety related to the GMO topic than Reddit is.
We will now compare two timelines; one calculating edits per year on the Genetically Modified Food Controversies page on Wikipedia and one calculating the activity on six selected subreddits over time. We begin with the Wikipedia timeline.
The first and biggest spike happens in 2013. This can be explained by numerous events. One of them, described in earlier interpretations of the timeline spikes, is the anti-GMO march. Several lawsuits of Monsanto against farmers brought to the attention of the public that year could have had an effect as well, at least in encouraging more discussion on the topic.
In this timeline, the second peak of the edit activity is in 2016. Our assumptions about the reasons for this peak are the following. One is that 2016 was the year when the topic of labelling GMO foods was at the centre of the public debate, which might have revived some interest in the GMO topic in general. Another reason might be that in 2015 the EU legislation regarding GMOs was changed, in ways that were mentioned earlier. This change might have provoked some controversies that started unfolding at the beginning of 2015 and came to a close only by the end of it.
A barely noticeable spike happens in 2018. In the other timelines, it is more evident and carries more impact. We can even notice a steady decrease in edits from 2012 onwards. This may be attributed to a decrease in interest in the topic of GMOs, or to a decrease in active users. One of the major events in the GMO field in 2018 was the U.S.D.A. announcing a G.M.O. labelling standard. This might have had a bigger impact on Wikipedia, as it is run by a U.S.-based organisation.
Activity on subreddits over time
The timeline above shows the accumulated activity of the six Subreddits, based on how many comments and submissions have been made on the Subreddits from 2012 to 2019. There are peaks in 2013, 2015 and a small one in 2018. This matches our searches for controversies in regards to GMO in the period 2012 to 2019. We found the spike in 2013 to potentially be due to the events such as the Monsanto lawsuits against farmers or the protests against GMO, the spike in 2015 to potentially be a change in the EU legislation, and finally the spike in 2018 to potentially be about the debate on the USDA GMO labelling standard.
These spikes on the six subreddits match the edits on the Wikipedia page “Genetically Modified Food Controversies”. When comparing the previous timeline and this one, the first thing we notice is that the top spike is different. In the edits per year on the Wiki page, the big spike is in 2013, while for the activity on Reddit, the spike is in 2015. This might be a result of the nature of the activity, as edits are needed more in the beginning and less and less later on, while social discourse about hot topics is always something people pursue on online platforms. The steady decrease in activity on Wikipedia versus a more hectic development on Reddit can also be explained by the fact that edits become redundant with time. One other reason might be that, with time, there is more consensus in the scientific community on the topic of GMOs, so a website like Wikipedia, which is based on verified facts, needs and generates less activity over time.
Through the data we have gathered and the methods we have used to process it, we can now begin to elaborate on the information we can draw from it.
We can point out similarities between the two Cortext maps in terms of the semantic themes addressed and their placement on the map. This could mean that there is a general consensus on the questions being asked and the facts that are available, both among expert groups and lay people. We still keep in mind that we consider Wikipedia editors to have more tangible expertise in the field, while we consider Reddit users to speak with the voice of the general public. These themes include topics such as human health, GM food and science, which leads us to believe that the underlying theme here is science, while human health and GM food are topics that complement and relate to each other frequently, but share their foundation in science.
In terms of the timelines showcased in this blog post, we found similarities within the activity in edits on Wikipedia posts and Reddit user engagements. An interesting observation to draw on when comparing these is the fact that activity increases in line with real-life events, such as political legislation changes, new scientific findings or social occurrences that relate to the theme of genetic modification in food production.
Additionally, when making networks of our data, we could see clusters where some authors discuss certain topics between them more than other topics.
The Monsanto topic is separated from the rest, showing us that the authors discussing Monsanto are not very active in the rest of the GMO discussions. Digging deeper into the map by reading some of the submissions in the clusters, we see that the clusters represent the different subreddits we have chosen for the analysis. We also see that some authors are very active within a certain subreddit and others are more active across.