Harnessing Legislative Data to Support Child Protection Efforts in India

In 2023, as part of the Patrick J. McGovern Foundation’s (PJMF) ‘Data to Safeguard Human Rights Accelerator’ cohort, Enfold Proactive Health Trust and CivicDataLab embarked on an exploratory study to understand, with the help of advanced data tools, how two child protection laws addressing child labour and child marriage are being implemented. We sought to leverage the potential of eCourts data to generate actionable evidence, while also testing the efficacy of Natural Language Processing (NLP) systems for scaling up the analysis.

by Anindita Pattanayak, Swagata Raha, Shruthi Ramakrishnan, and Shivangi Puri

Child protection efforts in India that rely on evidence-informed advocacy and intervention face a critical challenge: a significant data void on the implementation of child protection laws. The eCourts portal, managed by the e-Committee under the Supreme Court of India, offers information about ongoing and disposed judicial proceedings. Apart from copies of judicial orders, the portal indicates the provisions and legislation under which a case is filed, the case type, judge designation, judicial forum, dates of registration, hearings and disposal, and the pendency status and nature of disposal. Though the portal provides public access to a wealth of data, analysing that data at scale is prohibitively time-consuming and resource-intensive.

Expanded analysis of court data

Participation in the Accelerator enabled the team to test and deploy a robust data engineering pipeline for studying large data sets, which was not possible earlier because we relied solely on manual identification and curation of cases. In its previous studies on the implementation of child protection laws, Enfold analysed a sample set of judgments identified through a manual search of the eCourts portal. Each identified judgment was manually downloaded, and the relevant information from it was stored and processed in spreadsheet software for analysis. With the aid of technology in the PJMF Accelerator program, the scope of our analysis was significantly enhanced. The metadata on the eCourts portal could now be analysed in addition to judgments, giving us information on about 10,800 cases, where earlier studies were restricted to samples of 300–1,700 cases.

The tools also saved time and labour in extracting relevant judgments through pre-defined criteria. Further, we were able to test the current state-of-the-art NLP models trained on Indian legal text to identify and extract relevant portions of the judgment text that were needed for examining specific research questions. We verified the results manually to assess the accuracy and relevance of the text extracted using these models. To further improve the accuracy, we would like to test the efficacy of large language models, which might be better suited to process and contextualize text with vernacular phrases and region-specific use of English.

Enabling disaggregated datasets and analysis of the interplay of laws

Analysis of the metadata of cases has been instrumental in discerning the number of cases registered under child marriage and child labour laws. Government crime data published annually does not provide a full picture of the number of criminal trials that arise out of offences under these laws because it follows the “Primary Offence Rule,” which dictates that if a criminal incident involves several offences, only the crime with the most severe punishment is counted for official statistics. Owing to this, crimes that constitute minor offences when clubbed with serious offences are not reflected in the data on offences under laws regulating child labour and child marriage.
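The effect of this counting rule can be illustrated with a small sketch. The incidents, offence labels, and punishment values below are invented purely for illustration and are not drawn from any statute or from the study's data; the point is only to show how counting one offence per incident hides the less severely punished offences.

```python
from collections import Counter

# Hypothetical incidents: each is a list of (offence label, maximum punishment in years).
# Labels and punishment values are illustrative assumptions, not statutory figures.
incidents = [
    [("child_labour", 2), ("trafficking", 10)],
    [("child_labour", 2)],
    [("child_marriage", 2), ("sexual_offence", 20)],
]

def count_primary_only(incidents):
    """Count each incident once, under its most severely punished offence."""
    counts = Counter()
    for offences in incidents:
        primary = max(offences, key=lambda offence: offence[1])
        counts[primary[0]] += 1
    return counts

def count_all_offences(incidents):
    """Count every offence that appears in every incident."""
    counts = Counter()
    for offences in incidents:
        for name, _ in offences:
            counts[name] += 1
    return counts

primary_counts = count_primary_only(incidents)
full_counts = count_all_offences(incidents)

print("Counted under the rule:", dict(primary_counts))
print("Counted per offence:   ", dict(full_counts))
```

In this toy data, child labour appears in two incidents but is counted only once under the rule, and child marriage is not counted at all.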

The analysis of metadata revealed a much higher figure for the number of criminal trials under these legislations. For example, according to government crime data, 1,329 incidents of crime under child labour law were reported between 2015 and 2022 in the six selected states. According to court metadata, there were 9,193 ongoing or disposed criminal trials under child labour law in the same period, nearly seven times the figure in the crime data. Extracting this data has made it possible to gauge the volume of child labour cases in which criminal law is set into motion and to debunk the misconception that the law is poorly utilised.

Metadata analysis has also helped us map the interaction of child labour laws with other criminal laws and track trends in disposal and pendency. It has enabled district-wise and State-wise comparisons and the identification of geographical areas with a high volume of child labour cases in the judicial system. This can in turn be used to unpack the link between the incidence of child labour in these regions and the utilisation of available laws to regulate it. Further, trends in disposal and pendency can help identify districts that warrant examination due to high pendency levels, which can lead to necessary interventions such as establishing additional courts or increasing resources. Such insights can shape strategic interventions to monitor the implementation of human rights by providing valuable evidence to inform advocacy, policy formulation, and capacity-building training of institutional and civil society actors.
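As a rough illustration of the kind of district-wise comparison described above, the sketch below computes the share of pending cases per district. The records, field names, and district labels are hypothetical placeholders and do not reflect the eCourts schema or the study's actual pipeline.

```python
from collections import defaultdict

# Hypothetical metadata rows; field names and values are placeholders.
cases = [
    {"district": "District A", "status": "pending"},
    {"district": "District A", "status": "disposed"},
    {"district": "District A", "status": "pending"},
    {"district": "District B", "status": "disposed"},
]

def pendency_share(rows):
    """Return, per district, the fraction of cases that are still pending."""
    total = defaultdict(int)
    pending = defaultdict(int)
    for row in rows:
        total[row["district"]] += 1
        if row["status"] == "pending":
            pending[row["district"]] += 1
    return {district: pending[district] / total[district] for district in total}

shares = pendency_share(cases)

# Districts ordered from highest to lowest pendency share.
ranked = sorted(shares, key=shares.get, reverse=True)

for district in ranked:
    print(f"{district}: {shares[district]:.0%} pending")
```

A ranking of this kind is only a starting point; a district with few cases can show an extreme share, so case volume would need to be read alongside it.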

The PJMF accelerator program enriched the analysis and will impact advocacy and capacity building that Enfold undertakes on these issues, helping shape the methodology of future studies. The model of studying both judgments and metadata to glean comprehensive and context-specific insights on the legal landscape concerning a human rights issue can be adopted by others.

Looking Forward: Improving public data for transparency and accountability

While this exploratory study has demonstrated the immense potential of the eCourts portal as a rich data source, we encountered significant challenges in preparing the data for analysis. The suboptimal manner in which data is presented and errors in data entry hindered analysis and raised concerns about the reliability of the available data. We found widespread inconsistencies in references, a lack of standardized entry formats, and the use of varying terminologies across different regions of the country, all of which made analysis of eCourts data time-consuming and challenging. Disparate data points had to be meticulously distilled into standardized categories before meaningful analysis could commence and conclusive insights could be derived. This experience informed our approach to processing the metadata: our analysis assessed the quality of the metadata itself in addition to the application of the law. We mapped the range of errors, lack of uniformity, and inconsistencies in eCourts data, and in some instances measured the rate of errors.
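To give a sense of what distilling disparate entries into standardized categories can involve, here is a minimal sketch that maps free-text act names to a small set of categories using regular expressions. The raw entries, the patterns, and the category labels are all assumptions made for illustration; they are not the rules or data used in the study.

```python
import re

# Hypothetical patterns for two categories; real entries would need many more variants.
PATTERNS = {
    "child_labour": re.compile(r"child\s*(and\s*adolescent\s*)?lab(ou|o)r"),
    "child_marriage": re.compile(r"child\s*marriage"),
}

def normalise_act(raw):
    """Map a free-text act name to a standard category, or 'unmatched'."""
    # Lowercase and collapse punctuation and extra spaces before matching.
    text = re.sub(r"[^a-z0-9]+", " ", raw.lower()).strip()
    for category, pattern in PATTERNS.items():
        if pattern.search(text):
            return category
    return "unmatched"

# Invented examples of inconsistent entries.
raw_entries = [
    "Child Labour (P&R) Act",
    "CHILD LABOR ACT 1986",
    "Child and Adolescent Labour Act",
    "Prohibition of Child-Marriage Act",
    "I.P.C.",
]

normalised = [normalise_act(entry) for entry in raw_entries]

# Share of entries that could not be assigned to a category.
unmatched_share = normalised.count("unmatched") / len(normalised)

for raw, category in zip(raw_entries, normalised):
    print(f"{raw!r} -> {category}")
```

Entries left as "unmatched" would still need manual review, which is one way a rate of unresolved or erroneous entries could be estimated.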

The tools used with PJMF’s support helped us develop viable methodologies to clean eCourts metadata and glean relevant insights from it despite the errors and discrepancies present. For Enfold, these methodologies offer a framework for future analysis of data curated from eCourts. The same process can be employed by other organizations to analyze other laws or subjects using eCourts data, in furtherance of their own goals in the field of human rights.

Court data from the eCourts platform is a valuable “public good” and can be an important tool in ensuring accountability, reshaping the understanding of the implementation of laws, and effectively influencing reform and policy. This will become possible only if the data entry is streamlined and automated where possible, data is validated, entries are standardized and easily understandable, and standardized training is offered to data entry personnel nationwide to enhance the reliability and accuracy of available data. This exercise has helped CivicDataLab and Enfold develop specific recommendations for the eCourts platform to strengthen the transparency and accessibility of public data. This will likely improve insights into all laws safeguarding human rights beyond child protection laws. To learn more, read our full insights report.

The Patrick J. McGovern Foundation

Inviting conversations on how AI and data solutions create a thriving, equitable, and sustainable future for all.