How good (or bad) is the DOH at reporting COVID-19 data?

With confirmed COVID-19 cases in the Philippines surpassing 250,000 and with the Department of Health recently retracting their reported numbers, how is the agency faring in data reporting?

Scientia
Sep 15, 2020

News Feature | Jazryl Galarosa & Samantha Peniano

Graphics by Jazryl Galarosa

Every number counts. An increase in the number of cases is another person, another life affected. But reporting the numbers is not smooth sailing for the country.

Over the past months, the Department of Health (DOH) has been under fire for its data reporting and presentation. But how well (or badly) is the DOH actually doing, according to our experts?

Why does data reporting matter?

Why is data reporting important anyway? For our experts, the reported data is key to building models that depict the coronavirus pandemic and help us project, for instance, the number of cases. Through these models, we can make recommendations to improve our pandemic response.

“Accurate, up-to-date, and honest data help in addressing the COVID-19 pandemic as support for policymaking about the virus,” said Dr. Peter Cayton, Associate Professor at the UP School of Statistics and a member of UP COVID-19 Pandemic Response Team.

In the Philippines, data on COVID-19 can be collected from the DOH Data Drop, the Philippine Statistics Authority, and local government units (LGUs). This data is cleaned and augmented, which includes creating variables such as the estimated date of onset to help in analysis later on, especially when there are cases with missing information.
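To make the idea of an augmented variable concrete, here is a minimal sketch of one possible way to fill in an estimated date of onset when the reported onset date is missing. The field names (`date_onset`, `date_specimen`, `date_reported`), the sample records, and the fallback order are all assumptions made for illustration; they are not the actual Data Drop columns or the rule used by the experts quoted here.

```python
from datetime import date

# Hypothetical field names, ordered from most to least preferred.
FALLBACK_FIELDS = ("date_onset", "date_specimen", "date_reported")

def estimated_onset(case):
    """Return the first available date among the fallback fields, or None."""
    for field in FALLBACK_FIELDS:
        value = case.get(field)
        if value is not None:
            return value
    return None

# Made-up records for illustration only.
cases = [
    {"date_onset": date(2020, 8, 1), "date_specimen": date(2020, 8, 3), "date_reported": date(2020, 8, 6)},
    {"date_onset": None, "date_specimen": date(2020, 8, 4), "date_reported": date(2020, 8, 7)},
    {"date_onset": None, "date_specimen": None, "date_reported": None},
]

for case in cases:
    case["estimated_onset"] = estimated_onset(case)

# Count the cases that still have no usable date after augmentation.
still_missing = sum(1 for case in cases if case["estimated_onset"] is None)
print(still_missing)
```

A rule like this keeps more cases usable for analysis, but every fallback adds uncertainty, which is one reason analysts care about how many values are missing in the first place.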

Data on hand are then analyzed by mathematicians, statisticians, and epidemiologists who also work with social scientists, economists, and public health experts. Experts set up equations that reflect a certain situation by using the available data. Along with statistical analyses, these equations yield metrics, or key quantities, that describe the pandemic: the reproduction number (R), which measures how many people, on average, an infected person will pass the virus on to; the dispersion parameter (k), which highlights how infection rates differ among infected individuals; and the doubling time, which is the time it takes for cumulative cases, deaths, or recoveries to double.
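Of the metrics above, the doubling time is the simplest to compute by hand. The sketch below assumes cumulative counts grow exponentially between two dates, which is a common simplification and not necessarily the method used by the experts quoted here; the function name and the figures are hypothetical.

```python
import math

def doubling_time(count_start, count_end, days_elapsed):
    """Days needed for a cumulative count to double, assuming exponential growth."""
    if count_start <= 0 or count_end <= count_start or days_elapsed <= 0:
        raise ValueError("counts must be positive and growing over a positive time span")
    # Exponential growth: count_end = count_start * exp(growth_rate * days_elapsed)
    growth_rate = math.log(count_end / count_start) / days_elapsed
    return math.log(2) / growth_rate

# Hypothetical figures: cumulative cases rise from 1,000 to 2,000 in 7 days.
print(round(doubling_time(1000, 2000, 7), 2))  # 7.0
```

A longer doubling time means slower spread, so the same function applied week after week gives a rough sense of whether the outbreak is accelerating or easing.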

Since the parameters of the models are based on data, the importance of accurate and up-to-date data cannot be overstated. Among the important variables that DOH releases are different dates related to the status of a patient (e.g. collection of swab samples, death, recovery). The agency also used to publish the date of report of removal (DateRepRem), which indicates when a case was officially added to the count of recoveries or deaths.

“Timely, accurate data is key to achieving good models, and good models are those that can reflect the situation as close to reality as possible so that they can give equally good estimates of parameters, evaluations of control, and projections,” Dr. Carlene Perpetua Arceo, director of the Institute of Mathematics at UP Diliman, explained. Arceo also heads the Modeling and Applications research group of the institute.

There are different models that can be used for the COVID-19 pandemic, but the most common are time series analyses and the SIR and SEIR models.

Time series show the evolution of a variable’s value across time, for instance, the day-to-day rise of COVID-19 cases. Meanwhile, in SIR and SEIR models, the population is divided into compartments that represent people who are susceptible (S), exposed (E), infectious (I), and recovered (R); the SEIR model is the one that includes the exposed compartment. Simulations are done to determine the projected number of people in each compartment and the rate at which people move between the compartments.
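The compartment idea can be sketched in a few lines. The example below is a basic textbook SEIR model stepped forward with small time increments; it is not the model used by any group mentioned in this article, and the parameter values (`beta`, `sigma`, `gamma`) and population size are hypothetical numbers chosen only for illustration.

```python
def simulate_seir(population, initial_infectious, beta, sigma, gamma, days, steps_per_day=10):
    """Step a basic SEIR model forward in time (forward Euler method).

    beta:  transmission rate per day
    sigma: rate at which exposed people become infectious, per day
    gamma: rate at which infectious people recover, per day
    Returns a list of (S, E, I, R) tuples, one per day, starting at day 0.
    """
    s = float(population - initial_infectious)
    e, i, r = 0.0, float(initial_infectious), 0.0
    dt = 1.0 / steps_per_day
    history = [(s, e, i, r)]
    for _day in range(days):
        for _step in range(steps_per_day):
            new_exposed = beta * s * i / population * dt   # S -> E
            new_infectious = sigma * e * dt                # E -> I
            new_recovered = gamma * i * dt                 # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history

# Hypothetical parameters for illustration only.
history = simulate_seir(population=1_000_000, initial_infectious=10,
                        beta=0.5, sigma=0.2, gamma=0.1, days=120)
peak_day = max(range(len(history)), key=lambda d: history[d][2])
print("Day with the most infectious people:", peak_day)
```

In practice, modelers estimate parameters like these from reported data, which is why gaps or delays in the data feed directly into the projections.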

Infographics by Hanz Salvacion

Results from these models and other computations are then presented to the public. The UP COVID-19 Pandemic Response Team, a bioinformatics group doing simulations and analysis on the pandemic, publishes its results on its hub website. Journalists and graphic artists also help in making the results more accessible to Filipinos.

How is DOH doing in data reporting?

So, how exactly is the DOH doing in reporting the data related to COVID-19? For our scientists, DOH has been trying their best, and making the data available is one of the things the agency has done right.

“DOH data reporting has improved significantly compared to [reporting from January up to March]. I also know that DOH is still exerting efforts to further improve their data management,” said Dr. Jomar Rabajante, professor at the Institute of Mathematical Sciences and Physics at UP Los Baños and also a member of UP COVID-19 Pandemic Response Team.

Cayton shared the same opinion, noting how DOH put up figures daily and that their officials are communicating with the public. Before August 7, data on hospital supplies, COVID-19 dedicated hospital capacity, case information, and daily testing aggregates from laboratories were part of the typical data available on the Data Drop.

Since then, DOH has expanded its hospital capacity data to cover non-COVID numbers and has added daily cumulative data on health care workers in quarantine or infected with COVID-19, as well as quarantine facility statistics.

Despite its progress and efforts in improving data collection and management, DOH is still plagued with issues in its data reporting. The agency’s deviation from proper data practices has gravely affected the public’s confidence in its data.

“Everyone monitoring DOH data release has seen changes in format, some confusing labels, fractions of data that they admit to needing validation. All of this has implications on the construction and results of a model,” Arceo stated. She further noted that data trackers demand a faster and more accurate release of information.

Doubts about necessary non-identifying variables on COVID-19 cases (variables that cannot be used to re-identify respondents or patients), the large number of missing values, and data duplication are just some of the issues Cayton listed. Part of the problem is DOH’s removal of the DateRepRem variable.

Figures from the August 28 pandemic data show discrepancies in the reported numbers, according to Cayton: not all of the 134,474 recoveries have corresponding DateRecover data (in orange), and dates are missing for different health statuses. Graphics by Hanz Salvacion

“What the loss of the [DateRepRem] variable also implies is the difficulty to track recoveries and deaths to the different geographic areas. There is no way to facilitate the analysis without the DateRepRem, especially when DOH would also purge rows of information due to post-confirmation validation of duplicate data,” Cayton emphasized.

On the other hand, DOH implemented Oplan Recovery, an initiative which monitors the time-based status of confirmed COVID-19 deaths and recoveries, in its data reporting as the agency seeks to enhance its data collection and reconciliation efforts. According to DOH, Oplan Recovery “tags a patient as recovered when certain conditions are met even without repeat RT-PCR testing.”

This drew flak as DOH logged mild and asymptomatic cases as recoveries under the mass recovery scheme. The adjustment, however, is based on scientific evidence according to DOH as the World Health Organization updated their recommendation to discharge asymptomatic patients 10 days after a positive test result and to discharge symptomatic patients 10 days after they show symptoms, plus at least three days without symptoms.

The U.S. Centers for Disease Control and Prevention (CDC) further recommended implementing a symptom-based strategy, in which mild and asymptomatic cases can be discharged after monitoring if their symptoms disappear instead of waiting for a negative test result.

“A test-based strategy is no longer recommended because, in the majority of cases, it results in prolonged isolation of patients who continue to shed detectable SARS-CoV-2 RNA but are no longer infectious,” CDC said.

While Oplan Recovery is a helpful and sound strategy, Rabajante said that it should have been implemented a long time ago. He agrees with the program as long as we can double-check whether patients have really recovered, but noted that we have been lacking in monitoring our contact tracing policies.

“The science of time-based recovery is sound as it is a practice done by many countries combatting COVID-19. The problem is its implementation, of which I have shared earlier,” Cayton noted.

Aside from the issues in data reporting, Cayton pointed out that DOH has fallen short in opening a public discussion of its statistical methodologies and its release of statistics. Also, its lack of a “unified voice” in communicating risk further aggravates the pandemic, as quarantine policies are based on the available data.

Better data, better normal

An example of time series data: the time-varying R dashboard shows how well the Philippines is doing in addressing the coronavirus pandemic. Screenshot of dashboard from endcov.ph. Infographics by Hanz Salvacion

Arceo acknowledged the challenge for DOH to deliver the figures accurately.

“This is our first time to experience a pandemic, and we are addressing a big, big population. Still, much can be done by constant evaluation of procedures, communication, consultation, and use of technology,” Arceo said.

For Rabajante, data reporting must be coupled with effective science communication and risk communication. He also suggested that it would be helpful if DOH could automate the collection of data from different sources to minimize backlogs and errors in the pandemic statistics.

“DOH can also organize regular technical fora to explain to the media (then to the public) the details of the data and the analysis and interpretation of data so they can understand the issues and address misconceptions,” Rabajante added.

Cayton pointed out that time-based recovery must be the standard practice for hospitals, quarantine facilities, and LGUs. Introducing and bringing back more non-identifying variables such as the DateRepRem would be helpful in addressing the gaps of the COVID-19 data.

“With respect to the deviations earlier expressed, [DOH can improve their data reporting through] a better and unified system of data management of which DOH will enforce compliance from all data sources,” Cayton said, addressing the lack of data. He further suggested that the agency could release anonymized contact tracing data to monitor and assess our tracing policies.

As Rabajante pointed out, peer-reviewing pandemic data and making them accessible to the public would be crucial for the scientific community and LGUs to formulate data-driven and informed decisions in optimizing our pandemic response.

“Data is the backbone of evidence-based policymaking. Without accurate data, how can we be confident with our policies?” Rabajante stated.

Making the metrics matter

With mathematicians, statisticians, and experts in science communication shedding light on the frontlines of the infodemic we are facing, it is high time for us to value accurate, honest, and up-to-date data in addressing the COVID-19 pandemic.

“Having these accurate, honest, and up-to-date data accessible to the public lets the people assess government response and such open data provides a way for the public to place checks on the COVID-19 government policies. It also aids the public with information on the situation of different areas of the country to help people make informed decisions on their daily lives using the available information,” Cayton said.

When data is clear-cut, accurate, and up-to-date, projections and recommendations deduced from our mathematical models can be made with confidence. Otherwise, we are bound to further struggle in a public health crisis.

#

Scientia

The official student publication of the College of Science, UP Diliman.