Publishing Citizen Science Data: lessons from the Hong Kong Jellyfish Project

Scott C Edmunds
The CitizenScience.Asia Journal
7 min read · May 22, 2024

A Hong Kong-based citizen science project demonstrates how to follow best practice in data sharing by publishing a dataset of jellyfish sightings collected by citizen scientists from 2021 through 2023 in Hong Kong waters. This post offers some technical insight into how other projects can get their data permanently archived and research-ready.

Problems with scientific data collection are known to be hindering efforts to halt mass extinction and biodiversity loss. Scientists need this data to produce accurate models, to create policies based on these models that address the loss, and to identify and stop those responsible. The GBIF (Global Biodiversity Information Facility) platform is the go-to home for this type of data, and has grown massively since an OECD working group recommended its creation in 1999: it now hosts close to 3 billion biodiversity records. Because it is scalable, citizen science has become GBIF’s biggest source of data, particularly through the massively popular smartphone app-driven eBird and iNaturalist projects. These provide large volumes of data to GBIF but also introduce biases in the types of data collected (birds are heavily over-represented, for example). Some small and mid-sized citizen science projects also share their biodiversity data in GBIF, but more are needed to fill persistent regional and taxonomic gaps. There are also challenges in teaching the people who work on these many projects enough biodiversity informatics best practice to submit data that meets GBIF’s very precise data standards.

Carrying out a beach survey for jellyfish in Hong Kong

Targeted data publishing by scientific journals is one approach that has helped to fill some of these data gaps. Examples of citizen science data reaching GBIF this way include the Mosquito Alert project from Spain (which also includes data from Hong Kong school projects), TurtleSpot Taiwan, and the Kissing Bugs and Chagas Disease in the United States Community Science Program in North America.

To take this approach further, some scientific journals have started to work with citizen science projects directly, and the first fruits of this have just been published by the Hong Kong Jellyfish Project. It has been noted that citizen science is biased toward vertebrates (animals with backbones) and land-based ecosystems, as well as toward species groups that are easier to observe or that have big communities of hobbyists spotting them, such as birds and butterflies. Bottom of the heap, then, are marine creatures and squishy invertebrates, and this project is a perfect example of filling the gaps left by under-represented taxa and parts of the world.

To explain more, in this interview and video the founder of the Hong Kong Jellyfish Project, John Terenzini, gives some insight into the process of working with data experts to curate and archive this precious biodiversity data for future generations.

1. Tell us a bit about Hong Kong Jellyfish Project. Why did you feel getting citizens involved in surveying jellyfish in Hong Kong was useful?

Citizen science has been shown to be effective in monitoring biodiversity at broad geographic and temporal scales. Because jellyfish occurrences can be infrequent or irregular, having “many eyes” looking for them across the breadth of Hong Kong is the most efficient way of discovering what jellyfish we have in local waters. Also due to the difficulties in conducting marine research, from equipment needs to higher costs than terrestrial research, using citizen scientists’ observations is a cost-effective method, with a low barrier to participation due to the prevalence of smartphones used by the general public.

2. Gathering data from this wider range of sources, are there any new findings that have come out of it?

It has been really exciting to have citizen scientists share their discoveries with the Hong Kong Jellyfish Project. People’s curiosity about jellyfish really drives the whole project, and through their photographs and reports the HKJP has been able to publish several new species records for Hong Kong, including two new species records of box jellyfish to complement the new jellyfish species discovery in Mai Po by different researchers. A forthcoming paper by the HKJP will summarize the jellyfish diversity of Hong Kong, which is far more expansive than previously known, using citizen scientists’ reports and a review of the scientific literature.

3. You’ve collected data from several different sources (your website, iNaturalist, and social media or email submissions), so what have been the challenges in bringing all this data together?

Because each of these data sources is different in nature, the information they provide may differ and needs to be organized in a similar manner. On the website, there is a form detailing exactly what information is requested (time, date, location, species, etc.). On social media or through email, however, an observer may provide only one or two pieces of this information, and follow-up requests for more may be needed. The website form is in English and has been translated to Traditional Chinese, allowing observers to use whichever language they feel most comfortable with. I hope this lowers any barriers to participation by making people comfortable sharing their observations in the language they prefer. However, as my main language is English, any observations in Traditional Chinese need to be translated, especially if there are additional comments.

Once data is collected, hopefully with photos and/or videos, it needs to be compiled into a similar format for analysis, and any gaps in the data (e.g., a missing location) need to be addressed if possible. Jellyfish need to be identified to the lowest taxonomic level, preferably species, using the available photos/videos and information provided. Only after the dataset has been compiled and all gaps addressed can the data be used for analysis. Individual observations of new species records, for example, may require in-depth research into the existing literature or online resources. So, a great deal of time and effort is required to compile the observations into a complete dataset.
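As a rough illustration of the compilation step described above, the Python sketch below renames source-specific fields to one common layout and lists what is still missing for each report. It is a minimal sketch under stated assumptions, not the project's actual workflow: the `Sighting` schema, the key maps, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sighting:
    """Hypothetical common schema; the field names are illustrative only."""
    source: str
    date: Optional[str] = None
    location: Optional[str] = None
    species: Optional[str] = None
    comment: Optional[str] = None


# Fields that must be present before a record is ready for analysis (assumed).
REQUIRED = ("date", "location", "species")


def normalize(source: str, raw: dict, key_map: dict) -> Sighting:
    """Rename source-specific keys to the common schema; blanks become None."""
    values = {}
    for raw_key, common_key in key_map.items():
        value = (raw.get(raw_key) or "").strip()
        values[common_key] = value or None
    return Sighting(source=source, **values)


def missing_fields(sighting: Sighting) -> list:
    """Return the required fields that still need a follow-up request."""
    return [name for name in REQUIRED if getattr(sighting, name) is None]


# Hypothetical key maps: a structured web form versus a free-text email.
WEB_FORM_KEYS = {"Date": "date", "Location": "location",
                 "Species": "species", "Comments": "comment"}
EMAIL_KEYS = {"sent_on": "date", "place": "location", "body": "comment"}

# Made-up sample reports, for illustration only.
web = normalize(
    "website",
    {"Date": "2022-07-15", "Location": "Sai Kung",
     "Species": "Aurelia sp.", "Comments": ""},
    WEB_FORM_KEYS,
)
email = normalize(
    "email",
    {"sent_on": "2022-08-02", "body": "Saw a large jellyfish near the pier"},
    EMAIL_KEYS,
)

for sighting in (web, email):
    print(sighting.source, "missing:", missing_fields(sighting))
```

Keeping the gap check separate from the renaming step means a record can be re-checked after an observer replies with more detail, without touching the source-specific mapping.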

4. This is your first time submitting data to GBIF, and the GigaDB curators and the GBIF Asia Regional Support Team have worked with you to get all of the data submitted there. How did you find this, and how much have you learnt in the process? And is it going to change the way you collect data in the future?

It has been quite a learning process to be a part of getting this data onto GBIF! A huge thank you to the GigaDB curators, the GBIF Support Team, and everyone involved in working out how best to execute this process and get the final result publicly available. It certainly required a team of people to weave the many strands together, from deciding what to put into the dataset, to formatting it correctly, to getting it onto GBIF. There were so many things I did not know at the start of this process and learned from everyone involved. Knowing what is required to get data onto GBIF, especially given the different information received from the data sources you mention above, has prompted a rethink of how the data can be collected and compiled. I hope to implement these changes in future data processing.
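To give a sense of what "correctly formatting" data for GBIF can involve, here is a minimal Python sketch that maps a compiled record onto Darwin Core terms, the vocabulary GBIF uses for occurrence data, and writes a CSV table. The column names are real Darwin Core terms, but which terms this project used is an assumption, and the input layout, the `example-hkjp` prefix, and the sample record are hypothetical.

```python
import csv
import io
from datetime import date

# Real Darwin Core term names; the selection shown here is an assumption.
DWC_COLUMNS = [
    "occurrenceID", "basisOfRecord", "eventDate", "scientificName",
    "decimalLatitude", "decimalLongitude", "geodeticDatum", "countryCode",
]


def to_darwin_core(record: dict, dataset_prefix: str) -> dict:
    """Map one compiled record (hypothetical layout) to Darwin Core terms."""
    # Raises ValueError if the date is not in ISO 8601 form (YYYY-MM-DD).
    event_date = date.fromisoformat(record["date"]).isoformat()

    lat = float(record["lat"])
    lon = float(record["lon"])
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"coordinates out of range: {lat}, {lon}")

    return {
        "occurrenceID": f"{dataset_prefix}:{record['id']}",
        "basisOfRecord": "HumanObservation",  # a sighting reported by a person
        "eventDate": event_date,
        "scientificName": record["species"],
        "decimalLatitude": f"{lat:.5f}",
        "decimalLongitude": f"{lon:.5f}",
        "geodeticDatum": "WGS84",  # assumes smartphone GPS coordinates
        "countryCode": "HK",       # ISO 3166-1 alpha-2 code for Hong Kong
    }


def write_occurrences(records: list, dataset_prefix: str) -> str:
    """Return the records as CSV text with Darwin Core column headers."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DWC_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(to_darwin_core(record, dataset_prefix))
    return buffer.getvalue()


# Made-up sample record, for illustration only.
sample = [{"id": "0001", "date": "2022-07-15", "species": "Aurelia",
           "lat": "22.3813", "lon": "114.2736"}]
print(write_occurrences(sample, "example-hkjp"))
```

Validating dates and coordinates at this stage catches formatting problems before the data reaches curators, which is where a standard's precision tends to bite.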

The Hong Kong Jellyfish Project data displayed in the GBIF map

5. How do you hope people will use this data? What sort of scientific questions does it help answer?

Jellyfish in general are an under-studied group of organisms, and it would be great to know that my data is making a small contribution to illuminating our corner of the world. As the importance of jellyfish is increasingly recognized across the marine realm, it would be great to see this type of data used not only to advance our scientific understanding of local marine ecosystems, through the roles jellyfish play in food webs and ecosystem services, but also to educate the general public about these often-feared organisms as beautiful and essential components of the world’s oceans. A key goal would be for this information to inform management practices in sectors as diverse as fisheries, industry, tourism and recreation.

As is so often true in science, this data only engenders more questions. Discovering what jellyfish are present locally is only the beginning of a much longer process of understanding what ecological roles they play, how they affect a broad swathe of the marine realm, and how their effects can be managed, especially in the context of Hong Kong’s marine policy.

6. What’s next for the HKJP and what do you plan to do with future datasets?

As the project continues, I hope to continue to develop the bigger picture about what jellyfish are present in Hong Kong, improving on existing datasets. Countries around the South China Sea are known to have even greater jellyfish diversity, so we certainly do not know everything present in Hong Kong. I will keep promoting jellyfish to the general public to increase knowledge and acceptance of these fascinating creatures. I also hope to advocate for greater recognition of jellyfish and improved research opportunities from tertiary institutions and government bodies.

Screenshot of the data submission page on the Hong Kong Jellyfish Project website.

If you are interested in participating or learning more see the Hong Kong Jellyfish Project website: https://www.hkjellyfish.com/

And follow them on Facebook:

https://www.facebook.com/hkjellyfishproject/

References
Južnič-Zonta Ž et al. Mosquito alert: leveraging citizen science to create a GBIF mosquito occurrence dataset. GigaByte. 2022 May 30;2022:gigabyte54. https://doi.org/10.46471/gigabyte.54

Hoh DZ et al. A dataset of sea turtle occurrences around the Taiwan coast. Biodiversity Data Journal. 2022;10:e90196. https://doi.org/10.3897/BDJ.10.e90196

Terenzini J et al. Jellyfish in Hong Kong: a citizen science dataset. GigaByte. 2024. https://doi.org/10.46471/gigabyte.125

Scott C Edmunds
The CitizenScience.Asia Journal

Executive Editor of GigaScience, Citizen Science and Open Data nerd working at the BGI and based in Hong Kong.