Meet the researcher: Big data, conservation and adventures in mathematics

QUT Science & Engineering
Published in The LABS
7 min read · Apr 17, 2020

Research in big data has made great strides in the last few years since the terminology first entered our cultural lexicon. The application of data analysis and statistical modelling to every aspect of our lives is now standard practice. But researchers are coming up against new issues all the time. For example, how do we deal with new types of data, such as virtual reality? And even if we have a lot of data, how good are our predictions or insights?

Distinguished Professor Kerrie Mengersen from QUT has dedicated her career to “breaking open” large and diverse sets of data to help learn about problems in health, the environment, society and industry. She has worked in fields with incredible diversity, from building a national Australian cancer database to VR modelling Peruvian jaguar habitat for international conservation projects.

Distinguished Professor Kerrie Mengersen on-site in Peru collecting data on jaguar habitat.

“Because people are now talking about big data, they can see its opportunities. But big data is not useful by itself: it needs to be analysed. This is where statisticians play a big role,” Mengersen said.

“There is a huge demand in almost every area in business, science and society for people with strong quantitative skills. Children interested in mathematics, statistics and computing have a bright future.”

When asked about where these opportunities for future statistical interrogation may lie, Mengersen was clear about the potential reach of data science as a discipline.

“Everything is data. Big data can apply to almost any situation that will touch a person’s life.”

What is big data?

Big data is large collections of data that can reveal information about behaviours, patterns and trends in the world around us. It’s collected in all sorts of ways, from surveys and experiments, to phones and satellite data, to computer use and online behaviour.

“Nowadays we are all contributing to big data all the time, without even thinking about it,” Mengersen said.

“As a society, we have a lot of data at our disposal — and it’s increasing every day in every aspect of our lives and world.

“We can now try to harness information about these complex systems and understand or solve some of the world’s problems that seemed intractable before.”

The importance of ‘why’

Having large data sets is one thing, but it’s what we do with that information that really counts.

“Big data can be big in size but limited in terms of the information about what we’re interested in,” Mengersen explained.

“For example, we might be interested in patient pathways through the healthcare system and have a lot of information about individual aspects of that journey, but not how people actually track throughout the whole system. So you can have a lot of data and not enough information to learn about the thing you care about.

“Statisticians can not only analyse the data you have, but also identify what additional data are needed to answer questions of interest. They can also inform about ‘intelligent data collection’, so you only collect the data you need.”

If data is the raw input, then information is the meaning that we derive from it.

“The data can show us trends and patterns, but the next step is interrogating that information to find out what’s behind those patterns and if any of the observed differences are real,” Mengersen explained.

“We have to look at the interrelationships in the data. We can break it open and see what stories it’s telling us, who are the agents, how they affect and are affected by the data. We then present those stories to decision makers and specialists in the field, and that guides changes in practice or improves knowledge.

“From there it’s a cycle of refinement. We can collect more data, interrogate it, and present it. By going through this feedback process multiple times, we can learn and change — that’s how we answer that question of ‘why’.”

Data, statistics and you

When we’re all contributing to big data all the time, one of the key considerations for data scientists is how to protect individual privacy.

“When we’re dealing with huge volumes of personal data, we have to be very cognizant of the fact that we’re dealing with people. And we’re also dealing with the law and ethics, so our projects need to take that into account too,” Mengersen said.

“People need to really become data-educated, and they need to understand what data they’re providing and what some of the consequences of that are. For example, our mobile phones provide a huge amount of data about us. This can be of immense use for city planners and emergency services for example, but it can also be used against us or for illegal purposes.”

As Mengersen sees it, the basic education about data literacy is missing.

“We used to learn reading, writing and arithmetic. Now we need to learn about data, too.

“It’s also not just understanding the data that you’re putting out into the world and how it’s being used. It’s also important for all of us to be data literate so that we can make use of the data ourselves, and understand and participate in the decisions that are being made by others.”

Predictive data for human behaviour is complicated — for example, how can researchers account for very human behaviours like free will?

“There’s always variation, but we can predict to a point,” Mengersen said. “We can look at general signals and trends in human behaviour, and we can often quantify how people will behave in different ways around that general trend. Quantifying variation in predictions is as important as making the predictions themselves. And this isn’t only confined to human behaviour.

“Let’s look at an earthquake — you’d think that because it’s a geological phenomenon, we’d be able to predict it accurately. But there’s still always variation around what happens.

“However, the more we learn about a problem, the less variation there is around what we’re predicting, and the better our models and predictions become.”
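That last point can be illustrated with a short simulation. The Python sketch below is not taken from Mengersen's work; it is a minimal example under stated assumptions: a hypothetical quantity scattered around 50 with a standard deviation of 10, and a simple normal-approximation 95% interval around the sample average. It shows the interval around the estimate narrowing as more observations are collected.

```python
import math
import random
import statistics


def mean_with_interval(sample, z=1.96):
    """Return the sample mean and the half-width of an approximate 95% interval."""
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, z * std_err


# Fixed seed so repeated runs give the same numbers.
rng = random.Random(0)

# Hypothetical setting (an assumption for illustration only):
# observations vary around a true value of 50 with standard deviation 10.
TRUE_MEAN = 50
TRUE_SD = 10

half_widths = {}
for n in (100, 1000, 10000):
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n)]
    mean, half_width = mean_with_interval(sample)
    half_widths[n] = half_width
    print(f"n={n:>5}: estimate {mean:6.2f} +/- {half_width:.2f}")
```

Each tenfold increase in data shrinks the interval by roughly a factor of three (the square root of ten), which is one simple sense in which learning more about a problem reduces the variation around a prediction. Real applications would typically involve far richer models than a single average.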

Environment and conservation

Some of Mengersen’s most lauded work has been in data science and conservation. Over the last couple of years, she’s been developing virtual environments to gather new data from scientists, environmental managers and citizens. She and her colleagues have been using this new approach to help monitor the health of the Great Barrier Reef, identify koala habitat in Queensland, and collaborate with governments and peak environmental bodies in Peru to develop a protected habitat for the jaguar population.

Mengersen worked with environmental organisations and Indigenous groups to build a better understanding of jaguar behaviours and habitat.

A large part of the success of these projects has come from engaging the population in data collection itself.

“We’re not just asking people to be the data collectors, we are asking them to engage in the statistical modelling and analysis with us,” Mengersen said.

“This immersion and interaction is key for real-world applications of data analysis and modelling.”

Some of her previous data analysis work has investigated cheetah populations in southern Africa and orangutans in Borneo. While mathematics and statistics might not seem like the most direct pathway into conservation and environmentalism, Mengersen points out that we all have a role to play in protecting the world around us.

“Take deforestation, for example,” Mengersen said. “Often that’s happening because there’s an economic value on the land. So, we as researchers want to look further.

“Let’s quantify the value of the land from social, cultural and health perspectives of the people living there. Then decisions can be based on more than just money; they can also take into account people’s perception of the value of that land.”

Changing the perception of value can create a greater desire to protect habitat and environment, which Mengersen sees as key to conservation.

The application of maths

Mengersen’s work is very diverse. It ranges from the theoretical aspects of statistical modelling, to developing new methods and models for complex problems along with the computational methods to implement them, to applying these approaches to important problems in society and the environment.

“There’s a very common perception that maths is all solved and there’s nothing new to be discovered — that was also my perception when I first started university,” Mengersen said.

“But that’s not true. Maths and statistics are exciting research fields with many unsolved problems. And now there are so many more possibilities, especially looking at the new information sources and computational capacity we have today.

“It’s really opening up a whole new way of doing mathematics and statistics, and even the way that we think about modelling or describing systems.”

More recently, Mengersen and her colleagues have also brought the interactive data approach to a range of projects, including:

· Reef preservation — a citizen science project calling virtual reef divers to upload and classify coral reef images for scientific analysis

· cancer atlas — an interactive platform that allows people to access data on 20 different types of cancer, including geospatial incidence and survival rates

· food security — using satellite data to monitor crops, vegetation and deforestation.

The scale of the projects moves from localised to global, as Mengersen works with local organisations, state and national governments, and international agencies like the United Nations.

Creating visual, engaging platforms to investigate and work with information is key to making a difference.

“Allowing people to engage with this information in a way that’s visual and accessible starts those important conversations about why things are the way they are,” Mengersen said. “From that natural inquisitiveness, we can see advancement.”

For Mengersen, the opportunities with statistical science are endless.

“The great thing about being a statistician is that you can work in so many different areas. It’s always such an adventure.”

More information

Explore research at QUT’s Science and Engineering Faculty

Find out more about Distinguished Professor Kerrie Mengersen

