Measuring Migration: Old Challenges, New Opportunities

UC Berkeley D-Lab
Jan 22, 2024

by Suraj Nair, D-Lab Data Science Fellow

In the 21st century, patterns of human migration are being reshaped by forces including economic opportunity, conflict, and anthropogenic climate change. There is a pressing demand for reliable and timely data on human migration flows, which enable a better understanding of these (and other) drivers of migration and are key to designing and implementing policies that promote human development. Unfortunately, traditional methods of measuring migration, such as government censuses and sample surveys, have struggled to keep pace.

Take the example of India, the origin of more international migrants than any other country, and also home to half a billion internal migrants. A key source of internal migration data is the Census of India, which was last conducted in 2011. The last available nationally representative statistical survey focused on migration was conducted in 2007–2008. Essentially, there were no new official data on internal migration in India for over a decade, until the government added a migration module to the 4th round of the Periodic Labour Force Survey (PLFS) in 2020–2021!

In this blog post, I examine persistent challenges to measuring migration. I then focus on the opportunities offered by large digital trace datasets to address and overcome some of these challenges, and highlight their use in my ongoing research. I conclude by summarizing some of the drawbacks and disadvantages of these data.

The spectrum of human migration

At the heart of the measurement challenge is the diversity of migration types. Human migration encompasses a spectrum ranging from voluntary relocation for economic purposes, to forced displacement due to conflict, environmental factors, or the resulting deterioration in economic conditions. Defining migration thus becomes a delicate task when attempting to capture the essence of these diverse experiences. Moreover, the line between temporary movements and permanent resettlements is often blurry. These nuances become particularly important when examining migration and human displacement in the context of climate change, for example. By at least one account, our inability to distinguish permanent moves from temporary ones (especially in aggregate data) contributes to conflicting evidence on the relationship between climate and migration (Bohra-Mishra et al., 2014).

Many countries rely on the UN definition of long-term international migration, which focuses on individuals who change their country of usual residence for a period of 12 months or more. Defining internal migration remains far more contentious: different countries and organizations adopt varying criteria to define and measure it, which often makes cross-national or global comparisons less meaningful. In the Census of India, for example, any citizen who resides in a place different from their place of birth is classified as a migrant. By this definition, India had 454 million internal migrants in 2011, nearly 40% of the population. There is thus a dire need to improve the availability of data on internal migration. It does not help that there is a disproportionate focus on international migration, especially in popular media. In reality, most migration occurs within borders; by one estimate, there were 763 million internal migrants (globally) in 2015, a number which has likely grown in the years since. In contrast, there were around 281 million international migrants in 2020 (UN DESA, 2020).

Migration flows remain challenging to measure

A separate measurement challenge pertains to the differences between migration flows and stocks. Migration stocks capture the total number of migrants present in a specific location at a particular point in time. Data on migrant stocks are made available through government censuses, various administrative datasets, and national statistical surveys. The key thing to note is that these data offer a static perspective of migration. Migration flows, on the other hand, represent a dynamic process of people moving from one location to another over a specific period. While data on flows are most useful in providing insight into various socio-economic drivers and factors associated with migration, these data are often the hardest to collect.
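To make the distinction concrete, here is a minimal Python sketch. The move records, place names, and the helper functions `flows` and `migrant_stock` are all hypothetical and exist only for illustration. It counts flows over a period and stocks at a point in time from the same toy records, using a place-of-birth rule for stocks.

```python
from collections import Counter

# Hypothetical move records: (person_id, origin, destination, year of move).
moves = [
    ("p1", "A", "B", 2019),
    ("p2", "A", "B", 2020),
    ("p2", "B", "A", 2021),  # p2 returns home
    ("p3", "C", "B", 2021),
]
# Hypothetical place of birth for each person (p4 never moves).
birthplace = {"p1": "A", "p2": "A", "p3": "C", "p4": "B"}


def flows(moves, start_year, end_year):
    """Count moves per (origin, destination) pair within a period."""
    return Counter(
        (origin, dest)
        for _, origin, dest, year in moves
        if start_year <= year <= end_year
    )


def migrant_stock(moves, birthplace, as_of_year):
    """Count people living outside their birthplace at the end of a year."""
    location = dict(birthplace)
    # Replay moves in chronological order to find each person's location.
    for person, _, dest, year in sorted(moves, key=lambda m: m[3]):
        if year <= as_of_year:
            location[person] = dest
    return Counter(
        place
        for person, place in location.items()
        if place != birthplace[person]
    )


print(migrant_stock(moves, birthplace, 2020))  # stock by place of residence
print(migrant_stock(moves, birthplace, 2021))
print(flows(moves, 2021, 2021))                # moves during 2021 only
```

In this toy example the migrant stock in place B is the same at the end of 2020 and 2021, even though one person left B and another arrived during 2021. A static snapshot hides exactly the movement that flow data would reveal.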

In recent years, numerous studies have relied on data on international migration flows provided by the Organisation for Economic Co-operation and Development (OECD), or the United Nations (UN). However, these data primarily focus on developed nations, and do not have good coverage in Sub-Saharan Africa or South Asia. A common fix, in many cases, is for researchers to infer migration flows from data on migrant stocks. Recent evidence (see Berlemann et al. 2021) suggests that empirical results which rely on inferred flow data are likely highly sensitive to the choice of statistical method used to convert the stock data to flows. Once again, this highlights the shortcomings and consequences of persistent data gaps.
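A small Python sketch can illustrate why the conversion method matters. It applies two simple stock-differencing rules to the same pair of stock tables. The numbers and function names are hypothetical, and these rules are deliberately simplified stand-ins (they ignore births, deaths, and onward moves), not necessarily the methods evaluated in the work cited above.

```python
# Hypothetical bilateral stocks: stock[(birthplace, residence)] = people.
stock_2010 = {("A", "B"): 100, ("B", "A"): 50, ("A", "C"): 30}
stock_2020 = {("A", "B"): 140, ("B", "A"): 20, ("A", "C"): 30}


def stock_changes(earlier, later):
    """Change in each bilateral stock between two points in time."""
    pairs = set(earlier) | set(later)
    return {pair: later.get(pair, 0) - earlier.get(pair, 0) for pair in pairs}


def flows_drop_negative(earlier, later):
    """Rule 1: treat stock increases as flows and ignore decreases.

    Keys of the result are (origin, destination).
    """
    changes = stock_changes(earlier, later)
    return {pair: max(change, 0) for pair, change in changes.items()}


def flows_reverse_negative(earlier, later):
    """Rule 2: additionally treat a stock decrease as a return flow."""
    flows = flows_drop_negative(earlier, later)
    for (born, lives), change in stock_changes(earlier, later).items():
        if change < 0:
            # People born in `born` left `lives`, so assume they went home.
            back = (lives, born)
            flows[back] = flows.get(back, 0) - change
    return flows


print(flows_drop_negative(stock_2010, stock_2020)[("A", "B")])     # 40
print(flows_reverse_negative(stock_2010, stock_2020)[("A", "B")])  # 70
```

The same stock tables imply a flow of 40 people from A to B under the first rule and 70 under the second, because the second rule reads the shrinking stock of B-born residents in A as 30 people returning to B. Any downstream analysis inherits that choice.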

Digital data for migration

Over the past decade, the rapid proliferation of digital sensors has created tremendous opportunities to study human behavior. In particular, these non-traditional, digital “big” data sources hold unique potential in resource-constrained environments, where traditional sources of quantitative data are unavailable, outdated, or unreliable. Indeed, numerous studies highlight the use of these data to study patterns of human mobility, displacement, and migration (Blumenstock, 2012; Chi et al., 2020; Tai et al., 2022). Much of my ongoing research expands on these ideas, and explores the use of these data to measure and study patterns of migration at new granularity and scale.

In one ongoing project (joint work with Xiao Hui Tai, Shikhar Mehra, and Joshua Blumenstock), we demonstrate the use of mobile phone data in measuring patterns of seasonal labor migration, a key feature of developing economies (Banerjee and Duflo, 2007), in Afghanistan. This work highlights distinct advantages offered by these data. First, they are available at high frequency, allowing for timely and repeated measurement, and enabling us to study seasonal migration patterns over a seven-year period. Second, they enable the measurement of an important pattern of migration in a setting where barriers to traditional forms of measurement exist. Finally, they can be used to measure migration flows at country scale, at a fraction of the cost of physical surveys or other methods.
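As a rough illustration of how location traces can be turned into migration events, the Python sketch below assigns each subscriber a modal location per month and flags a move when that location changes between two sustained stays. This is a simplified stand-in, not the method used in the project described above. The records, region labels, function names, and the `min_months` threshold are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records: (subscriber_id, month index, region of the tower used).
records = (
    [("s1", m, "north") for m in (1, 2, 3)]
    + [("s1", m, "south") for m in (4, 5)]   # seasonal stay elsewhere
    + [("s1", m, "north") for m in (6, 7)]   # return home
    + [("s2", m, "west") for m in (1, 2, 4, 5)]
    + [("s2", 3, "north")]                   # a one-month trip, not a move
)


def monthly_home(records):
    """Assign each subscriber a modal region for every month with activity."""
    counts = defaultdict(Counter)
    for subscriber, month, region in records:
        counts[(subscriber, month)][region] += 1
    homes = defaultdict(dict)
    for (subscriber, month), counter in counts.items():
        homes[subscriber][month] = counter.most_common(1)[0][0]
    return homes


def detect_moves(homes, min_months=2):
    """Flag a move when the home region changes between two stays that
    each span at least `min_months` consecutive observed months."""
    moves = []
    for subscriber, by_month in homes.items():
        # Collapse the monthly sequence into runs of the same region.
        runs = []  # each run is [region, first_month, n_months]
        for month in sorted(by_month):
            region = by_month[month]
            if runs and runs[-1][0] == region:
                runs[-1][2] += 1
            else:
                runs.append([region, month, 1])
        # Short stays are treated as trips and dropped before comparing.
        long_runs = [run for run in runs if run[2] >= min_months]
        for prev, cur in zip(long_runs, long_runs[1:]):
            if prev[0] != cur[0]:
                moves.append((subscriber, prev[0], cur[0], cur[1]))
    return moves


print(detect_moves(monthly_home(records)))
```

Here the first subscriber's out-and-back pattern is recorded as two moves, while the second subscriber's one-month trip is ignored. The threshold is a design choice: a lower `min_months` captures shorter seasonal stays but also misclassifies more brief trips as migration. Months with no activity are simply skipped in this sketch, which a real analysis would need to handle explicitly.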

A careful and considered use of “Big” Data

It is important to note that the use of these large datasets in migration research can make entire populations visible in a way that was rarely possible before. However, these data are far from “objective” in any way or form. To claim that they are would be to promote a “statistical imaginary” (boyd, 2021). Just as the use of these datasets makes certain people and populations visible and legible, it also marginalizes, excludes, and renders others invisible (Bowker and Star, 2000). For example, in low-income settings, call detail records and social media data are likely to systematically exclude the poorest of the poor, who are often the true population of interest in the context of economic development. As a result, measures of mobility from smartphones, for example, are unlikely to be representative of the movement patterns of the population at large (Milusheva et al., 2021). Similarly, de-identification and pseudonymization are often insufficient to protect privacy (de Montjoye et al., 2018), and there are likely costs (in addition to the many benefits) to making certain patterns of movement visible to decision makers and planners (Taylor, 2023). It is therefore crucial that we acknowledge and emphasize the need for a careful and considered use of these data.

References

  1. Banerjee, Abhijit, Esther Duflo, Rachel Glennerster, and Cynthia Kinnan, “The Miracle of Microfinance? Evidence from a Randomized Evaluation,” American Economic Journal: Applied Economics, January 2015, 7 (1), 22–53.
  2. Berlemann, Michael and Max Friedrich Steinhardt, “Climate Change, Natural Disasters, and Migration — a Survey of the Empirical Evidence,” CESifo Economic Studies, December 2017, 63 (4), 353–385.
  3. Blumenstock, Joshua E., “Inferring patterns of internal migration from mobile phone call records: evidence from Rwanda”, Information Technology for Development, 2012, 18:2, 107–125.
  4. Bohra-Mishra, Pratikshya, Michael Oppenheimer, and Solomon M. Hsiang, “Nonlinear permanent migration response to climatic variations but minimal response to disasters,” Proceedings of the National Academy of Sciences, July 2014, 111 (27), 9780–9785.
  5. Chi, Guanghua, Fengyang Lin, Guangqing Chi, and Joshua E. Blumenstock, “A general approach to detecting migration events in digital trace data,” PLOS ONE, October 2020, 15 (10), e0239408.
  6. boyd, danah, “Statistical Imaginaries,” December 2021.
  7. de Montjoye, Yves-Alexandre, Sébastien Gambs, Vincent Blondel, Geoffrey Canright, Nicolas de Cordes, Sébastien Deletaille, Kenth Engø-Monsen, Manuel Garcia-Herranz, Jake Kendall, Cameron Kerry, Gautier Krings, Emmanuel Letouzé, Miguel Luengo-Oroz, Nuria Oliver, Luc Rocher, Alex Rutherford, Zbigniew Smoreda, Jessica Steele, Erik Wetter, Alex “Sandy” Pentland, and Linus Bengtsson, “On the privacy-conscientious use of mobile phone data,” Scientific Data, December 2018, 5 (1), 180286.
  8. Bowker, Geoffrey C. and Susan Leigh Star, Sorting Things Out: Classification and Its Consequences, The MIT Press, August 2000.
  9. Milusheva, Sveta, Daniel Bjorkegren, and Leonardo Viotti, “Assessing Bias in Smartphone Mobility Estimates in Low Income Countries,” in “Proceedings of the 4th ACM SIGCAS Conference on Computing and Sustainable Societies,” COMPASS ’21, Association for Computing Machinery, New York, NY, USA, September 2021, pp. 364–378.
  10. Tai, Xiao Hui, Shikhar Mehra, and Joshua E. Blumenstock, “Mobile phone data reveal the effects of violence on internal displacement in Afghanistan,” Nature Human Behaviour, 2022, 6, 624–634. https://doi.org/10.1038/s41562-022-01336-4
  11. Taylor, Linnet, “Data Justice, Computational Social Science and Policy,” in Eleonora Bertoni, Matteo Fontana, Lorenzo Gabrielli, Serena Signorelli, and Michele Vespe, eds., Handbook of Computational Social Science for Policy, Cham: Springer International Publishing, 2023, pp. 41–56.
