Mapping Energy Technology
A supply of energy is crucial for human demands, but how do we extract, manage and access it?
Feb 26, 2019
Atakan Kara, Corinna Voll, Rasmus Nissen
Energy use for human plans and programs has contributed to global climate change and related crises, which in turn affect the human relation to energy. As a result, new methods of energy generation, distribution and storage are emerging, and they bring with them new modes of technical innovation and social organization.
While some of these technologies prioritize quickly securing the energy supply for humans in the face of environmental adversity (such as fracking and nuclear power), others focus on environmental regeneration and on limiting human impact on nature (such as renewable energies). Furthermore, these developments expand and warp the ways in which energy is organized socially, politically and economically. Struggles over prioritization, expertise and boundaries emerge, and these make energy technology controversial.
We want to better understand the shape of this controversy. To do so, we investigate the landscape of energy technology on a public, open-source medium. Wikipedia provides us with a good starting point to dive into the different themes, conflicts and shifts related to energy technology. The results of our investigation and mapping compel us to pursue the debates taking place in this realm further. By investigating Scopus, we delve deeper into the controversy and uncover the debates in the scientific community surrounding what is currently the most prominent field in energy technology: renewable energy. Within the field, the controversy surrounding the methods of generation, distribution and storage of energy proved interesting, as did the questions of efficiency and reliability, which were linked externally to ‘clean’ nuclear energy.
Protocol for data harvest from Wikipedia
In order to map our controversy, we investigated the Wikipedia category “Energy technology”. Using Wikipedia’s application programming interface (API), we retrieved the member pages of the category (scraping) and followed its subcategories down to a chosen depth (crawling). In total, we harvested 946 pages from “Energy technology” and its subcategories. The resulting dataset was then used to illustrate various relations in the medium we are investigating.
We built our first network by harvesting all the links from the category member pages, which we could then use to represent referencing relations among the pages. We then acquired the complete texts of our category members and searched them for certain keywords. We also applied a semantic analysis script to the full texts, which highlights phrases that are frequently used together. Lastly, we retrieved the revision histories of the pages “Renewable energy” and “Nuclear power” to produce a timeline based on the dataset.
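The category harvest described above can be sketched in a few lines of Python. This is a minimal illustration and not the script we actually ran: it assumes the public MediaWiki endpoint of English Wikipedia and its `categorymembers` query, and the function names (`fetch_json`, `category_members`, `crawl`) and the `max_depth` parameter are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the public MediaWiki API of English Wikipedia.
API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_json(params):
    """Send one GET request to the MediaWiki API and decode the JSON reply."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url, headers={"User-Agent": "category-crawl-sketch/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def category_members(category, fetch=fetch_json):
    """Return (pages, subcategories) listed directly under one category."""
    pages, subcats = [], []
    params = {
        "action": "query", "list": "categorymembers", "format": "json",
        "cmtitle": category, "cmlimit": "500", "cmtype": "page|subcat",
    }
    while True:
        data = fetch(params)
        for member in data["query"]["categorymembers"]:
            # Namespace 14 is the Category namespace, i.e. a subcategory.
            (subcats if member["ns"] == 14 else pages).append(member["title"])
        if "continue" not in data:
            return pages, subcats
        # Long categories are paginated; pass the continuation values back.
        params = {**params, **data["continue"]}


def crawl(root, max_depth, fetch=fetch_json):
    """Collect page titles from root and its subcategories down to max_depth."""
    seen_categories, found_pages = {root}, set()
    frontier = [root]
    for _ in range(max_depth + 1):
        next_frontier = []
        for category in frontier:
            pages, subcats = category_members(category, fetch)
            found_pages.update(pages)
            for sub in subcats:
                if sub not in seen_categories:  # guard against category loops
                    seen_categories.add(sub)
                    next_frontier.append(sub)
        frontier = next_frontier
    return found_pages
```

Calling `crawl("Category:Energy technology", max_depth=2)` would return the set of page titles found in the category and two levels of subcategories. The depth value here is only an example, and the result depends on the state of Wikipedia at the time of the request.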
Network of “Energy technology”
We built our initial network by scraping all of the references from the Wikipedia pages we harvested, and then visualizing the links that the pages share within the category and its subcategories. We chose this network style because we wanted to depict a broader picture of our topic that highlights certain clusters. Scraping all references on Wikipedia pages can lead to repetition and thereby exaggeration of certain links, such as those in the ‘Template’ section, which is shared by many pages in a category. Such a media effect can be overcome by scraping only references within the text of a page (in-text references). We have, however, limited the network so that it does not include references outside our category and subcategories (members only), for a more reduced picture of sources with strong clusters. As a result, our network also became more readable than it would have been with the very large number of external sources.
The figure shows an image of a network. The network consists of nodes that represent pages on Wikipedia and edges that represent hyperlinks connecting two Wikipedia pages. The nodes are sized according to how often they are cited by other Wikipedia pages, so a big node is cited more by other Wikipedia pages than a smaller node. The nodes are colored according to how well groups of nodes are statistically separated, which gives an indication of node clusters. We see several thematic clusters. With a little help from the statistical coloring, we can identify 7 clusters with themes related to nuclear energy (black), renewable energy (purple), gas (green), heat (blue), combustion (orange), shale oil (mint green) and hydrogen (red).
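The two choices described above, keeping only links between category members and sizing nodes by how often they are cited, can be illustrated with a short Python sketch. It assumes the harvested links are available as a dictionary from page title to the list of linked titles; the function names are hypothetical, and the statistical coloring is left to the visualization program.

```python
from collections import Counter


def member_network(links, members):
    """Keep only hyperlinks whose source and target are both category members."""
    edges = set()
    for source, targets in links.items():
        if source not in members:
            continue
        for target in targets:
            # A set stores each directed link once, so a link repeated
            # on the same page (e.g. via a template) is not exaggerated.
            if target in members and target != source:
                edges.add((source, target))
    return edges


def in_degree(edges):
    """Count how many member pages link to each page (used as node size)."""
    return Counter(target for _, target in edges)
```

A page that no other member links to simply gets a count of zero, which corresponds to the smallest node size in the figure.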
It is interesting to find the cluster of nuclear energy relatively close to renewable energy, which leads us to ask whether nuclear energy is being viewed more positively with regard to sustainability than before. A quick qualitative glance at Wikipedia pages suggests that it is: nuclear power is considered to be an energy type with low greenhouse gas and carbon emissions. However, while certain stakeholders advocate it as a safe and sustainable energy source, others point to previous accidents that had tragic consequences for the environment and human populations. The hydrogen cluster close to the renewable energy cluster also indicates an association. We have traced this relation to the energy input for producing hydrogen fuel, which can be more sustainable if renewable energies are employed.
Distributed generation in the landscape of energy technology
To be able to navigate in our network, we have augmented it with specific keywords from “Distributed generation”. Our initial interest lay in distributed generation, but since this focus was too narrow as a controversy, we broadened our search. After having set a more general perspective on energy technology, with our keywords, we tried to understand how our initial focus is located in the broader landscape of energy technology. In particular, we wanted to get a picture of how the different types of energy production within distributed generation are located in our work.
We acquired the complete texts of our category harvest and searched them for certain keywords. We then located these keywords within the clusters of our previous network. Our aim was to identify where decentralized energy generation and storage fall in the larger context of energy technology. To highlight various practices, we selected headlines from the “Distributed generation” Wikipedia page: cogeneration, solar power, wind power, hydro power, waste-to-energy, energy storage.
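The keyword search described above can be sketched as follows. This is an illustration under assumptions: the page texts are taken to be available as a dictionary from page title to plain text, matching is case-insensitive on whole phrases, and the function name `keyword_counts` is hypothetical.

```python
import re


def keyword_counts(texts, keywords):
    """Count case-insensitive whole-phrase matches of each keyword per page."""
    patterns = {
        keyword: re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
        for keyword in keywords
    }
    return {
        title: {kw: len(pattern.findall(text)) for kw, pattern in patterns.items()}
        for title, text in texts.items()
    }
```

With whole-phrase matching, “solar power” is counted in “Solar power” but not in “solar powered”. Spelling variants such as “hydropower” would have to be added as separate keywords if they should count as well.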
Together with waste-to-energy and energy storage, cogeneration appears frequently on the page of distributed generation itself. It has a clear centre in the page of cogeneration, whereas solar power is found on a few more pages, mainly renewable energy, space-based solar power and solar power. Wind power is equally dominant in the renewable energy page. Both solar and wind power are clearly located in the left area of the picture. Hydro power is mainly concentrated on the pages of the International Centre on Small Hydro Power and Micro hydro, followed by power plant, and has fewer occurrences overall in the network. Waste-to-energy is the least represented in the network, and is only visible in energy engineering, peaking power plant, distributed generation and Syngas. It is the only one among the types of distributed energy that has a high occurrence in the cluster of gas. Energy storage is found mainly in grid energy storage and thermal energy storage, but it is also spread out more widely. Overall, all types of distributed energy have their centre in the cluster of renewable energies. The decentralised forms of energy production seem to be related to eco-conscious modes of energy production.
A network extracted through semantic analysis
As the next step, we wanted to look for phrases that emerge in energy technology by being frequently used together and in relation to one another. By retrieving such information, we can become privy to the internal terms and concepts of the field, and detect connections that are more semantically based (namely, discursive).
For such a semantic analysis of our dataset, we used the software CorText to receive a ranked list of terms from our body of texts. The first step is part-of-speech (PoS) tagging. In this step, the sentence structure is analysed and each term is assigned a word class (noun, adjective, verb, …). After PoS tagging, the process of chunking follows: a list of all noun phrases (combinations of nouns and adjectives) in the body of texts is built. Next, stemming groups variants of the same word combination (e.g. ‘water and gas’ and ‘gas and water’ become one category). Then, unithood and termhood are measured. Unithood is high when a phrase constitutes a semantic unit (e.g. ‘renewable energy’, but not ‘many types’). Termhood is high when a phrase counts as a domain-specific concept. The program then produces a similarity network in which terms that occur together are clustered. CorText uses an algorithm to filter edges; this prevents frequently appearing phrases from ending up in the center only because they are connected to many other phrases in the network. The filtering therefore leaves just the most significant co-occurrences. Last, cohesive thematic areas are detected as communities.
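To make the idea of a co-occurrence network with edge filtering more tangible, here is a deliberately simplified Python sketch. It is not CorText’s algorithm: it assumes that a list of lowercase phrases has already been extracted, counts in how many documents each pair of phrases appears together, and then keeps only the strongest edges of each phrase as a stand-in for CorText’s filtering step. The function names and the `top_k` parameter are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(documents, phrases):
    """Count, for every pair of phrases, the documents that contain both.

    Phrases are expected in lowercase; matching is by plain substring.
    """
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        present = sorted(p for p in phrases if p in lowered)
        counts.update(combinations(present, 2))
    return counts


def strongest_edges(counts, top_k):
    """Keep an edge if it is among the top_k heaviest edges of an endpoint."""
    by_node = {}
    for (a, b), weight in counts.items():
        by_node.setdefault(a, []).append((weight, a, b))
        by_node.setdefault(b, []).append((weight, a, b))
    kept = set()
    for edges in by_node.values():
        for _, a, b in sorted(edges, reverse=True)[:top_k]:
            kept.add((a, b))
    return kept
```

The design choice to keep an edge when either endpoint ranks it highly means that weakly connected phrases stay attached to the network, while a very frequent phrase does not automatically keep all of its many weak links.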
The image is separated into five clusters. While there is no strong periphery, it has a clear centre, which revolves around terms associated with renewable energies. Even though CorText prevents frequently appearing phrases from being placed in the center merely because of their frequency, we can still point to a distinct center in our network. This shows that even though these central clusters frequently co-occur with the other clusters, they were significant enough to the clusters that enclose them. It also shows us that the debate around renewable energies is closely related to a few other discourses, primarily those linked to hydrogen (blue) and thermal power (yellow). Even though we saw many links between the themes of nuclear power and renewable energies in Gephi, the semantic analysis reveals that the two discourses are not directly involved with each other.
Timelines for Wikipedia pages: “Renewable energy” & “Nuclear power”
Below are two timelines based on the revision histories of the “Renewable energy” and “Nuclear power” Wikipedia pages respectively. We made them in order to identify points in Wikipedia history where interest in the subjects was significant, which we believe can point to certain stages of controversy.
We created these bump charts in RawGraphs, covering a period of 10 years and illustrating the revision counts for each page. One thing to clarify at the outset is that the number of individual users is similar for both pages, although nuclear power has nearly 700 more revisions. This is not immediately apparent in the charts, where renewable energy has a fluctuating timeline and nuclear power has merely two significant peaks. We inspected these periods in both revision histories, and the revisions were made for different reasons. For the renewable energy page, we can see that over the years Wikipedia users have expanded many different aspects of the field, such as, but not limited to, ‘wind power’, ‘solar power’, ‘hydro power’ and ‘biomass’. Nuclear power, on the other hand, has seen more revisions in the last year, but these are mainly connected to controversial issues within the topic, such as ‘development and early opposition to nuclear power’, ‘high level radioactive waste’ and ‘nuclear waste’, and ‘renewable energy compared to nuclear power’. We were surprised that certain incidents, such as the Fukushima nuclear disaster in 2011, have not visibly affected the revisions of the page, although we had initially aimed to locate them on the timelines. We chose these topics for the timelines because we wanted to see whether major events in recent years had led to increased revision activity online; however, the peaks of revision for renewable energy (2010, 2019) and for nuclear power (2019) do not correlate with any public event we know of.
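The figures behind such a timeline, revisions per year and the number of individual users, can be derived from a revision history with a few lines of Python. The sketch assumes each revision is a dictionary with a `timestamp` in ISO 8601 form (for example `2011-03-11T06:12:00Z`) and a `user` field, as the MediaWiki API returns them when timestamps and users are requested; the function name is hypothetical.

```python
from collections import Counter


def summarize_revisions(revisions):
    """Return ({year: revision count}, number of distinct users)."""
    # The first four characters of an ISO 8601 timestamp are the year.
    per_year = Counter(int(rev["timestamp"][:4]) for rev in revisions)
    # Revisions with a hidden user name carry no "user" key and are skipped.
    users = {rev["user"] for rev in revisions if "user" in rev}
    return dict(sorted(per_year.items())), len(users)
```

The yearly counts can then be exported as a table with one row per page and year, which is the kind of input a bump chart needs.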
Mapping the scientific controversy
We are keen to focus more clearly on the prevalent aspects of the controversy over securing an energy supply for human demands and programs. Thus, we have narrowed our scope to one of the dominant fields within energy technology, based on our findings on Wikipedia: renewable energy. To understand our new focus better, we were interested in reviewing which subject areas contribute to the research, how these relate to each other, and what role different countries play in the academic community. We have chosen to focus on the controversy in academia in order to base our further investigation on the scientific debate that surrounds it.
Protocol for data harvest from Scopus
For the second level of our controversy mapping, we harvested our data from Scopus, which presents itself as the largest abstract and citation database of peer-reviewed literature. Scopus collects citations and abstracts for scientific articles and arranges them according to different search criteria.
Initially, we set the search parameters to cover the keyword [“renewable energy”] for the years 1998–2018. This resulted in over 99,000 articles, which we sorted from the most cited to the least cited in order to identify the more authoritative articles in the scientific community. We then harvested two different sets of data: one based on the different scientific fields, and the other based on the different countries producing scientific content on renewable energy. The former was built by limiting our search results to the 16 top subject areas featuring articles on renewable energy, and scraping citation and bibliographic information, abstracts, keywords and other metadata from the top 4% most cited articles. This resulted in 6961 articles from various fields, representing a broader range of research fields. The latter dataset was harvested by limiting our search results to the 10 countries with the most cited articles and scraping the same information from the top 10%, resulting in 6595 articles from various countries, which are thus represented more diversely. The subject area dataset was used to build one network that depicts the number of citations per article, and another that depicts the references amongst the articles. The country dataset was also used to build a network that depicts the references amongst the articles. Afterward, we ran a keyword search for “nuclear” in the dataset in order to find articles on renewable energy that relate to nuclear power. The resulting 139 articles were used to run a semantic analysis in CorText, to identify the discursive network around renewable energy articles that feature the keyword “nuclear”. Furthermore, the country dataset was used to construct a bump chart timeline in RawGraphs, representing each country’s article publications over a period of 20 years.
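The selection of the most cited share of articles per subject area (or per country) can be sketched as below. This is an illustration, not the procedure Scopus applies: it assumes the export is a list of dictionaries with a `citations` field, and the rounding rule (always rounding up and keeping at least one article per group) is our own assumption. The function and parameter names are hypothetical.

```python
import math


def top_fraction(records, group_key, fraction):
    """Per group, keep the most cited records making up the given fraction."""
    groups = {}
    for record in records:
        groups.setdefault(record[group_key], []).append(record)
    selected = []
    for items in groups.values():
        ranked = sorted(items, key=lambda r: r["citations"], reverse=True)
        # Assumed rounding rule: round up, and never drop a group entirely.
        keep = max(1, math.ceil(len(ranked) * fraction))
        selected.extend(ranked[:keep])
    return selected
```

For the subject area dataset this would correspond to `top_fraction(records, "subject_area", 0.04)`, and for the country dataset to `top_fraction(records, "country", 0.10)`, where the field names `subject_area` and `country` are placeholders for whatever the export calls them.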
Citation-network
The cluster with the most cited articles overall deals with the chemical challenges of solar energy production; the highest count, 5027 citations, is achieved by the article titled ‘Conjugated polymer-based organic solar cells’. This cluster is connected only to the cluster on solar and photovoltaic energy, which makes it seem rather isolated, since it does not refer to any other cluster. The node sizes here seem to result from a high number of citations from outside the network. The dominant nodes in the grey clusters also reflect a general dominance of the theme of solar energy within renewable energy research. The tendency towards this source of energy production was also mirrored in the previously analyzed Wikipedia debate. The outstanding articles can further be interpreted as highly established knowledge, serving as reference points for other articles by, as the picture suggests, authors within the domain. This high count was reached through citations from articles that were either not among the top 4% most cited within their subject area, not within the 16 most relevant subject areas, or, most likely, from articles within academia that do not appear on Scopus when the search term [“renewable energy”] is used. Only one article within the dark grey cluster cites an article from the cluster on solar photovoltaic energy. Research into the chemical properties seems to be a thriving research theme; however, it is growing rather separately from articles on other modes of production. The cluster with the second largest number of citations is found within distributed generation (purple). The two biggest nodes are articles focusing on grid integration and synchronization. Similarly, these seem to represent authoritative articles within distributed generation.
In contrast to the grey cluster, these nodes score higher with regard to citations from articles within the network, while they have likely also received citations from outside the network. The cluster itself is connected to two other clusters: above, we find the cluster on energy storage in hydrogen; below, it connects to a community of publications on microgrids, involving different kinds of optimization studies regarding energy storage, and single studies on photovoltaic, wind and thermal energy. Other notable clusters, such as the light blue and green clusters, appear in the network with many articles that are, however, less cited. At the same time, these clusters seem to connect to at least three other clusters, which shows that they cross-reference between clusters to a higher degree.
Subject area-network based on coreference
In the Wikipedia-study we saw a close relationship between the keywords related to distributed generation and the cluster of renewable energy. We wanted to find out in what directions research on the field is moving, and to investigate how the landscape of renewable energy has evolved and still is evolving thematically.
The network has been built on the assumption that articles with shared references revolve around similar topics and thus give us insight into different research communities. The node sizing has been chosen to identify significant nodes when a topic of a cluster is assessed. The coloring is done to demarcate clusters and has been done with the statistical function embedded in the visualization program (modularity). A source of error in the network is that different citation styles can cause a coreference not to be recognized.
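The principle behind such a coreference network (two articles are linked when they share references) can be sketched in Python as follows. This is a simplified illustration and not the tool we used: it assumes each article comes with a list of reference strings, and the `normalize` step is only a crude attempt to reduce the citation-style problem mentioned above. All names are hypothetical.

```python
from itertools import combinations


def normalize(reference):
    """Lowercase a reference and strip punctuation and extra whitespace."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in reference)
    return " ".join(cleaned.split())


def coupling_edges(reference_lists, min_shared=1):
    """Link two articles when they share at least min_shared references.

    reference_lists maps an article id to its list of reference strings;
    the result maps (article_a, article_b) to the number of shared references.
    """
    refs = {
        article: {normalize(r) for r in references}
        for article, references in reference_lists.items()
    }
    edges = {}
    for a, b in combinations(sorted(refs), 2):
        shared = len(refs[a] & refs[b])
        if shared >= min_shared:
            edges[(a, b)] = shared
    return edges
```

The number of shared references can serve as the edge weight. References that differ in more than punctuation or capitalization (for instance abbreviated journal names) would still go unmatched with this simple normalization.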
The topology of renewable energies
In this section, we examine the topology of the thematic network above. The graph shows a network with two clearly separated clusters: a small black and yellow one at the top and a big multi-colored one in the center.
The top cluster contains a black sub-cluster with the theme “nanomaterials for water splitting”. The yellow sub-cluster centers around topics such as photocatalytic water splitting. Both themes appear in both sub-clusters, which explains the relationship between them, and a significant share of the articles belongs to the domain of material science. The overarching theme of the top cluster is water splitting techniques for energy storage in hydrogen. It is well separated because its articles focus on the development of highly specialized material sciences that are not directly related to the more applied research which makes up the majority of articles in the main cluster. It is linked to the center cluster through a central article on nanostructures and photocatalytic water splitting. The edges connect to an article about present electricity production and which primary energy forms should preferably be utilized. This article acts as a bridging element, though it is primarily connected to the central cluster.
The central cluster revolves around current renewable energy sources and their relation to society. The main red sub-cluster is a mix of current renewable energy technologies that share the theme of the contemporary transition to renewable energies. The big dark green cluster shares the topic of macro-economic perspectives on renewable energies, and many of its articles belong to the subject area of economics, econometrics, and finance. The light blue cluster is about the optimization of methods for renewable energy. The light green and orange clusters focus on biofuels, and the dark blue cluster centers around wind energy production, especially in Anglo-Saxon countries. The pink cluster shares a theme of smart grid optimization, and the grey one is about heating technologies. Finally, the turquoise cluster revolves around decision-making frameworks for renewable energy and, surprisingly, does not contain a single article from the subject area of economics, econometrics, and finance. A closer look reveals that the papers on decision-making rely on multiple criteria, such as technical, economic, environmental and social impacts. This could indicate a disagreement about decision parameters: some focus on cost-benefit analysis, while others resort to multi-dimensional criteria analysis. Another interesting issue within the red transition cluster is the presence of articles on new types of economies, in which hydrogen economies are discussed as part of the transition to renewable energies; this indicates research that challenges the currently dominant economic system. The dark green cluster on macro-economic perspectives, on the other hand, seems to hold a different view, which can be inferred from its remote position in the central cluster.
Country-network based on coreference
After having analyzed the different thematic clusters, we are now interested in seeing how these play out at the country level. Some clear country-specific clusters can be identified, while other countries do not seem to cluster as clearly. The US is present in several smaller clusters that appear as offshoots of the big main cluster, for example at the very top of the picture. This cluster represents articles on wind power and storage systems. The overall presence of the US in the network is not surprising, since it has also been shown to be the country with the highest overall production (see bar chart below). Both the US and China seem to dominate the far right cluster (“USA 2&3”; “China 1”). A comparison with the subject area network makes clear that the US dominates research on photocatalytic water splitting, together with a somewhat smaller number of Chinese articles. Both China and the US are very active in research on nanomaterials for water splitting. China is furthermore present in various parts of the main cluster. A demarcated cluster dominated by China can be found in “China 2” and deals with solar power storage. India can be clearly located at the center of the big main cluster; here, the articles focus on hybrid energy solutions and often deal with an Indian context. Germany, as the fourth most represented country, is scattered throughout the network, and its clustering can be seen most clearly in “Germany 1”. This part deals with public policy. In order to check whether different languages might influence references to same-country texts, we investigated the language of the articles within the network. Language barriers do not seem to influence coreference noticeably, since 98.29% of the articles were in English, while Chinese accounted for only 1.61% and German for 0.05%.
Network extracted through semantic analysis
Above is a network built on a semantic analysis. In our database based on the different subject areas, we searched for the keyword “nuclear”. Our previous Wikipedia analysis pointed to a host of relations between the clusters of renewable energy and nuclear energy, based on the references these categories make to one another. We were curious about which relations in the current energy debate link nuclear energy to renewables, and therefore wanted to reveal the kinds of noun phrases used in the articles that mention “nuclear”. We ran a CorText semantic analysis on the index keywords of the resulting 139 articles, extracted 300 phrases that frequently co-occur, and obtained the network depicted above. This analysis proved thematically enlightening and allowed us to identify certain aspects of the debate.
Investigating our database qualitatively reveals that these articles link “nuclear” to an array of discussions surrounding renewable energy. As illustrated in the red cluster, articles that feature nuclear power frequently articulate the economic non-viability of fossil fuel use and gas emissions. Glancing through some of these articles’ abstracts shows that they champion nuclear power as a replacement in both economic and environmental terms. Such articulations are continued on the right side of the network. The light green cluster centers around the geographic requirements of renewable energy types, such as a supply of sunlight and/or wind, the fact that these cannot always be steady supplies, and the question of whether they are reliable. These concerns are once again articulated in relation to nuclear power, which is presented as less geographically specific and more reliable (there are atoms everywhere). The light blue cluster focuses on the efficiency side of nuclear power as a renewable energy resource and highlights “many jobs” as linked with the other phrases in the cluster. The article itself mentions the projected decrease in jobs in the coal and natural gas industries, whereas an increased employment scheme for the future is based around renewable energies, carbon capture and nuclear energy. This can be distantly linked to the yellow cluster, which features phrases on the chemical carbon cycle, a human-instigated form of photosynthesis for capturing carbon in the environment. The article mentions that, as well as decreasing the carbon levels in the ecosystem, this anthropogenic cycle can be supplied with renewable energy resources and nuclear energy.
Overall, in our database nuclear energy is represented in the scientific articles on renewable energy either as a fast track companion to the renewable path, or a more efficient or reliable method that could replace it (due to advances in technology that secure nuclear power plants from fallouts).
Timeline of country publications on renewable energy
The biggest increase in cited articles in the field occurs between 2006 and 2008, which could represent the initial rise of interest in renewable energy research in the scientific community. In the time leading up to this rise, the USA, Germany and the UK, along with India, can be seen as the most active publishers. However, after 2007, China, which previously did not rank as high, becomes a frontrunner in publications. Between 2008 and 2016, the USA and China lead the chart, whereas the remaining countries, such as India, Italy and the UK, have fluctuating timelines. Based on the areas the different countries cover in the chart, it is also visible that the USA and China at times have publication counts more than double those of other individual countries. This is most likely related to the demographics and the scientific communities in the different countries. As can be seen in the bar chart (fig. XX), the US and China are the two leading countries in the field of renewable energies between 1998 and 2018. From 2014 onwards, China has overtaken the USA as the country with the most articles published, which is a surprising change considering that China had been the least publishing country back in 2002. The timeline shows that China has become more visible in scientific research on renewable energy, which could mean that it is gaining more authority on the issue. Meanwhile, the US continuously decreases in publications, falling back to third place after China and India around 2017. Since more recent articles have had less time to be cited, the overall decrease in highly cited articles after 2014 in the timeline can be explained as such a media effect. At the endpoint of our timeline in 2018, we can observe China, Germany, India and Australia, all of which formerly ranked lower than the USA, at the highest ranks of our chart.
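A bump chart orders the countries by rank within each year. The ranking step behind the timeline discussed above can be sketched as follows, assuming the publication counts are available as a nested dictionary of year, country and article count; the function name and the alphabetical tie-breaking rule are our own assumptions.

```python
def yearly_ranks(counts):
    """Turn {year: {country: n}} into {year: [countries, most articles first]}."""
    return {
        year: sorted(by_country, key=lambda c: (-by_country[c], c))
        for year, by_country in sorted(counts.items())
    }
```

A country overtaking another, as described for China and the USA, shows up as a change of position between the lists of two consecutive years.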
Conclusion
In this final section, we summarize the main findings, moving from the broad Wikipedia study on energy technologies to the narrower study on renewable energy within academia.
On Wikipedia, we saw the cluster of renewable energy far away from the clusters of combustion and shale oil but closely related to the clusters of heat and hydrogen, which points towards issues with the storage of renewable energy. Furthermore, nuclear energy was in the vicinity of the renewable energy cluster, with which it turned out to share concerns about emissions.
From the keyword search on topics closely related to distributed generation, we saw that they are primarily located in the cluster of renewable energy. The keyword energy storage is centred close to the hydrogen cluster, but it is also associated with heat and appears in the node bridging to the nuclear cluster.
The bump chart shows a massive increase in article production on renewable energies in 2007. It also shows that China has produced more articles on the topic than the USA since 2014, and that India may have done so as well since 2016. This may be biased by the media effect mentioned in the related section above.
In the co-reference network on subject areas from Scopus, we found that the development of highly specialized material sciences is not directly related to more applied research which forms a majority of articles in the main cluster. In the same network, we find that the transition to renewable energies evokes discussion on economic systems and criteria for decision-making.
In order to understand which knowledge claims are made within the field of renewable energy, we needed to find out what knowledge counts as authoritative. Research into the chemical properties of solar power seems to be a thriving research theme; however, it is growing rather separately from articles on other modes of production. Grid integration and synchronization also seem to provide authoritative articles within distributed generation, but were similarly rather unrelated to other research clusters. Linked to the previous findings, this shows us that the field of renewable energy research seems to dissolve into different branches and is far from being homogeneous. This thematic heterogeneity is also mirrored in the investigation at the country level. The USA and China seem to prioritize research on the material level of renewable energy, whereas India seems to prioritize research on applied technologies. Academics based in different countries therefore pull the debate in different directions. Clearly, country characteristics need to be considered here, since they frame the context of research. A developing country such as India might be more interested in the application of different types of renewable energy generation to secure its energy demand. Interest in renewable energy has not always been steady over time; the timeline showed that the topic is at times more and at times less pursued in academic research. A combination of social, political and economic factors comes into play here, shaping its direction.
Referring back to our Wikipedia debate, and having investigated the internal controversy surrounding the generation and storage of renewable energies, it is compelling to investigate the external interaction the field has with nuclear power. The semantic analysis reveals that nuclear energy is increasingly articulated as a clean energy type that could accompany renewable methods. Furthermore, nuclear power is heralded as a more economical, efficient and reliable method compared to the geographically specific methods of energy generation, such as wind and solar power. We can also detect nuclear power’s role in emerging industries, such as the chemical carbon cycle, as well as in social settings, for instance in job creation and public acceptance.
We saw different framings of energy technology come to the fore, which pave the way to diverse visions of the future. Investigating how researchers paint the picture of this transition gave us a glimpse of the diversity within the contested field of renewable energies. The decision now lies not only in the hands of those engaged in framing the issue in an academic context, but also depends on the engagement of voices that question and reframe established visions of the present and future.