Tidytlg: An R Package for Clinical Reporting using Tidyverse

Sheng-Wei Wang
Johnson & Johnson Open Source
Jan 10, 2023

Statistical programmers in the pharmaceutical industry have traditionally used SAS for clinical trial reporting. Open-source technologies have grown rapidly in recent years, and more pharmaceutical companies are adopting R as an alternative or complement to SAS for this work. To ease programmers' transition from SAS to R, we developed the tidytlg package, which creates the tables, listings, and graphs (TLGs) for clinical study reports.

What is tidytlg?

Tidytlg is built on the tidyverse and offers a suite of analysis functions that summarize descriptive statistics (univariate statistics, and counts with percentages) for table creation, plus a function that converts analysis results to RTF or HTML output. For graphs, tidytlg can combine a plot object created by ggplot2, or a PNG file, with titles and footnotes to produce RTF or HTML output. The function design aims to strike a balance between our internal SAS macro suites and R, reducing the learning curve for SAS programmers who are beginning to adopt R for clinical trial reporting.
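To make the two kinds of descriptive statistics concrete, here is a minimal base R sketch of a mean (SD) summary and an "n (x.x%)" count summary by treatment arm. The data frame `toy_stats` and its values are made up for illustration, and this is not how tidytlg computes its results internally; it only shows the kind of output the analysis functions summarize.

```r
# Hypothetical toy data: treatment arm, age, and race for six subjects
toy_stats <- data.frame(
  TRT  = c("Placebo", "Placebo", "Placebo", "Active", "Active", "Active"),
  AGE  = c(60, 70, 80, 55, 65, 75),
  RACE = c("WHITE", "WHITE", "ASIAN", "WHITE", "ASIAN", "ASIAN"),
  stringsAsFactors = FALSE
)

# Univariate statistics: mean (SD) of age within each treatment arm
age_stats <- sapply(split(toy_stats$AGE, toy_stats$TRT), function(x) {
  sprintf("%.1f (%.2f)", mean(x), sd(x))
})

# Counts (percentages): subjects per race within each arm, as "n (x.x%)"
counts <- table(toy_stats$RACE, toy_stats$TRT)
pct <- prop.table(counts, margin = 2) * 100  # column percentages
race_stats <- matrix(
  sprintf("%d (%.1f%%)", as.vector(counts), as.vector(pct)),
  nrow = nrow(counts),
  dimnames = dimnames(counts)
)

print(age_stats)
print(race_stats)
```

Each cell of `race_stats` uses the arm's own subject count as the denominator, which is the usual convention for by-treatment demographic tables.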

Why Open Source?

We initially developed tidytlg as an internal R package for the Clinical & Statistical Programming group within Johnson & Johnson to create TLGs in production. We found that programmers can quickly learn it and apply it to create TLGs in R, which met our initial goal. We want to share it with the community from which we have learned so much, and we welcome your feedback and contributions to enhance the package in the future. We also value the transparency that comes with open source, and open-sourcing this package gives us the added benefit of easing submission efforts.

How does it work?

The TLG programming workflow usually consists of the following steps:

1. Process the analysis datasets (e.g., filter data or convert character variables to factors).

2. Generate analysis results by creating analysis rows of summary statistics (for tables) or plots (for graphs).

3. Output analysis results in a designated format such as RTF or HTML.
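As a rough illustration of step 1, the sketch below filters a small data frame to an analysis population and converts a character variable to a factor using dplyr. The data frame `adsl_toy` and its values are hypothetical stand-ins for a real analysis dataset; the variable names simply mimic common ADSL conventions.

```r
library(dplyr)

# Hypothetical toy data standing in for an ADSL-like analysis dataset
adsl_toy <- data.frame(
  USUBJID = c("01", "02", "03", "04"),
  ITTFL   = c("Y", "Y", "N", "Y"),
  SEX     = c("F", "M", "F", "M"),
  AGE     = c(64, 71, 58, 80),
  stringsAsFactors = FALSE
)

# Step 1: keep the analysis population, then convert a character
# variable to a factor so its levels display in a fixed order
ittpop_toy <- adsl_toy %>%
  filter(ITTFL == "Y") %>%
  mutate(SEX = factor(SEX, levels = c("F", "M"), labels = c("Female", "Male")))

print(ittpop_toy)
```

Setting the factor levels explicitly controls the row order in any later frequency table, including levels that happen to have zero subjects.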

This workflow can be implemented in multiple ways with this package:

· Functional method: build a custom script step by step for each TLG

· Metadata method: create a generic script that utilizes column and table metadata to produce each TLG result.
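The idea behind the metadata method can be sketched in a few lines of base R: a metadata table names which summary function to apply to which variable, and a generic driver loops over its rows. Everything here (`toy_data`, `summarise_mean`, `summarise_counts`, `toy_metadata`) is hypothetical and only illustrates the pattern; it is not the implementation of tidytlg's `generate_results()`.

```r
# Hypothetical toy data, for illustration only
toy_data <- data.frame(
  AGE  = c(60, 70, 55, 65, 75),
  RACE = c("WHITE", "ASIAN", "WHITE", "WHITE", "ASIAN"),
  stringsAsFactors = FALSE
)

# Hypothetical summary helpers that share one signature and output shape
summarise_mean <- function(df, rowvar) {
  data.frame(label = "Mean",
             result = sprintf("%.1f", mean(df[[rowvar]])),
             stringsAsFactors = FALSE)
}

summarise_counts <- function(df, rowvar) {
  counts <- table(df[[rowvar]])
  data.frame(label = names(counts),
             result = as.character(as.vector(counts)),
             stringsAsFactors = FALSE)
}

# Table metadata: one row per analysis block
toy_metadata <- data.frame(
  func   = c("summarise_mean", "summarise_counts"),
  rowvar = c("AGE", "RACE"),
  stringsAsFactors = FALSE
)

# Generic driver: call each named function with its arguments, then stack
blocks <- lapply(seq_len(nrow(toy_metadata)), function(i) {
  do.call(toy_metadata$func[i],
          list(df = toy_data, rowvar = toy_metadata$rowvar[i]))
})
result <- do.call(rbind, blocks)

print(result)
```

With this design, producing a different table means editing the metadata rows rather than the script, which is the appeal of the metadata method for large batches of similar TLGs.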

Functional method example

library(dplyr)
library(tidytlg)

# Note: cdisc_adsl is a dataset built into the package
ittpop <- cdisc_adsl %>%
  filter(ITTFL == "Y")

# Frequency of Intent-to-Treat patients by planned treatment
tbl1 <- freq(ittpop,
             rowvar = "ITTFL",
             statlist = statlist("n"),
             colvar = "TRT01P",
             rowtext = "Analysis Set: Intent-to-Treat Population",
             subset = ITTFL == "Y")

# N, MEAN (SD), MEDIAN, RANGE, IQ Range of age by planned treatment
tbl2 <- univar(ittpop,
               rowvar = "AGE",
               colvar = "TRT01P",
               row_header = "Age (Years)")

# Frequency of Race by planned treatment
tbl3 <- freq(ittpop,
             rowvar = "RACE",
             statlist = statlist(c("N", "n (x.x%)")),
             colvar = "TRT01P",
             row_header = "Race, n(%)")

# Combine results together
tbl <- bind_table(tbl1, tbl2, tbl3)

# Convert to hux object ----------------------------------------------------
gentlg(huxme = tbl,
       orientation = "landscape",
       file = "DEMO",
       title = "Custom Method",
       footers = "Produced with tidytlg",
       colspan = list(c("", "", "Xanomeline", "Xanomeline")),
       colheader = c("", "Placebo", "High", "Low"),
       wcol = .30)

Metadata method example

library(dplyr)
library(tidytlg)

adsl <- cdisc_adsl

# One row per analysis block: which function to call, on which data and variable
table_metadata <- tibble::tribble(
  ~anbr, ~func,    ~df,    ~rowvar, ~rowtext,            ~row_header,   ~statlist,                    ~subset,
  1,     "freq",   "adsl", "ITTFL", "Analysis set: itt", NA,            statlist("n"),                "ITTFL == 'Y'",
  2,     "univar", "adsl", "AGE",   NA,                  "Age (Years)", NA,                           NA,
  3,     "freq",   "adsl", "RACE",  NA,                  "Race, n(%)",  statlist(c("N", "n (x.x%)")), NA
) %>%
  mutate(colvar = "TRT01PN")

tbl <- generate_results(table_metadata,
                        column_metadata_file = system.file("extdata/column_metadata.xlsx", package = "tidytlg"),
                        tbltype = "type1")

# Convert to hux object ----------------------------------------------------
tblid <- "Table01"

gentlg(huxme = tbl,
       orientation = "landscape",
       file = tblid,
       title_file = system.file("extdata/titles.xls", package = "tidytlg"),
       wcol = .30)

Closing Thoughts

This post has given a high-level overview of the tidytlg package. We hope it conveys what tidytlg can accomplish and the design thinking behind easing the transition from SAS to R in clinical reporting. We welcome bug reports, suggestions for improving functionality, and contributions to the package.

Check out our documentation and repository on GitHub: https://github.com/pharmaverse/tidytlg

Sheng-Wei Wang
Johnson & Johnson Open Source

I am a technical lead of the Methodology & Innovation group within Clinical & Statistical Programming at Johnson & Johnson.