Overhauling the Indian Statistical System for a post-Covid world: Going beyond Economics
At the recently concluded India Policy Forum, held in July by the National Council of Applied Economic Research (New Delhi), panellists from across the board rued the lack of quality data collected by government agencies, which has prevented researchers and policy makers from better understanding the pandemic: its spread across different sub-population groups, people’s responses to it in terms of their perception of the risk and the preventive measures they took, the extent of economic distress precipitated by the crisis, and the response of the central and state governments. In particular, as TV channels telecast visuals of migrants left to fend for themselves in the cities, experts pointed out that the lack of credible, high-quality official data on migrants and migration prevented an appropriate policy response. Similarly, granular health data at the neighbourhood level would have proved critical in shaping the public health response to the Covid crisis. However, in emphasizing the need for an institutional mechanism that provides reliable and timely statistics, academicians have focused narrowly on economic, demographic and health data. It is imperative that we widen the ambit of the kinds of data needed at regular intervals to gain deeper insights into Indian society. The present crisis is an opportunity to broaden the discussion on the types of data that we would want our official statistical system to gather in a post-Covid world.
To build my case, I have provided in the table below a snapshot of some of the different kinds of data that are available in the USA to researchers, students and the general public with little or no restriction. These data are routinely collected with public funding and housed in federal agencies or major universities. For each survey, the table lists its objectives, year of inception and institutional affiliation. I want to reiterate that these surveys are highlighted only as illustrative examples. Federal statistical agencies such as the U.S. Census Bureau, the Bureau of Labor Statistics and the National Center for Health Statistics collect data beyond those listed below, and these too are in the public domain. Likewise, I have taken the case of the US, but other advanced countries, such as those in Europe, have similarly rich data on sociological, attitudinal, psychological well-being and other aspects of their societies.
A look at the list indicates that none of these surveys focuses on the economy alone; rather, the variables on which data are collected span the disciplines of economics, sociology, psychology, demography, health and child development, among others. This is sensible because much of the cutting edge of contemporary research lies along the margins of the various social science disciplines. In contrast, India simply does not have the rich tapestry of data that is available in advanced countries, and is arguably ‘data poor’ in that sense too. Much of the official data in India focuses on parameters related to the economy (such as employment and unemployment, consumption expenditure, agricultural output and landholdings, and manufacturing output), while the census gives us a range of data on demographics and on socio-economic and health indicators, and the National Family Health Survey collects, along with demographic data, useful information pertaining to women in the reproductive age group.
The next two related points to note are the preponderance of longitudinal data and the exclusive focus on sub-population groups in these surveys. The Panel Study of Income Dynamics (PSID) is the longest-running longitudinal household survey in the world. In contrast, the India Human Development Survey, jointly conducted by the National Council of Applied Economic Research and the University of Maryland, is the only nationally representative panel household survey in India and has had only two waves so far, the first in 2004-05. Other examples of longitudinal data include the annual ASER surveys and the Young Lives India study. The Periodic Labour Force Survey, which now stands suspended, was the only official survey that had a panel element in its design, with a rolling sample in urban areas. The argument is not to replace cross-sectional data with longitudinal data, but rather to add the latter to our data toolkit.
The National Achievement Survey, which is housed in the National Council of Educational Research and Training, is perhaps the only instance where a survey focuses on a particular group, viz., children in schools. The NSO distinguishes four survey types: household surveys, enterprise surveys, surveys of village facilities, and surveys of land and livestock holdings. None of them focuses exclusively on a sub-section of the population. The lack of surveys in which the sample is limited to a sub-section of the population that is followed at regular intervals over a period of time means that we are losing out on the immense insights such surveys offer. For example, if we had a nationally representative survey like the Fragile Families Study, focussing exclusively on economically vulnerable households that had been tracked at regular intervals of 10 years and for which we had collected a vast array of variables, we would have a better sense of the factors that help families move out of poverty or keep them in it. Likewise, panel data on migrants would have been very useful in responding to the migrant crisis in the initial months following the lockdown. There are other sub-sections of the population for whom panel data would yield deeper insights into their lives, such as tribal populations, linguistic minorities, young children and adolescents, and people living in slums. It is also important to note that such surveys are not completely absent in India (examples include the Young Lives India study and the Indian Early Childhood Education Impact study¹), but they rely mostly on funding from private sources, are usually not nationally representative, and are not sustainable over long periods of time. Long-running panel surveys of the kind listed in the table cannot depend on private funds alone and need funding from the state.
The Panel Study of Income Dynamics, for example, receives funding from, among others, the National Institutes of Health and the National Institute on Aging, both federal agencies of the US government.
In summary, commentators have expressed deep angst in recent years over the loss of credibility that the Indian statistical system has suffered. Much of the discussion has rightly been around the opaque manner in which the government handled ‘inconvenient’ data emanating from the NSO, the lack of quality data made available in a regular and timely manner, and concerns about data privacy. Discussions have also been held on moving beyond PAPI (pen-and-paper interviewing) to other modes of data collection that incorporate technological advances, such as CAPI (computer-assisted personal interviewing) and telephonic interviews. However, a comprehensive overhaul of the official statistical system must also assess the different kinds of data that are needed for a nuanced understanding of Indian society in a post-Covid world. In this, the government can draw on the efforts of researchers from India and across the globe who have undertaken many insightful surveys in the country. Such surveys can be funded by the relevant ministry and hosted by institutions (academic or otherwise) across the country, as is done, for instance, with the National Family Health Survey, which is conducted under the aegis of the Ministry of Health and Family Welfare and hosted by the International Institute for Population Sciences. A relevant question here is whether government funding will inevitably translate into government interference. This is, of course, a possibility, and appropriate mechanisms of accountability must be developed to guard against it, because ultimately there is one and only one custodian of all data generated within India, and that is the people of the country.
- The Longitudinal Ageing Study in India (LASI) is another example. It is a nationally representative survey, and it is hosted jointly by the Harvard School of Public Health, the International Institute for Population Sciences and the University of Southern California.