The pharmaverse is a collaborative project where leading pharmaceutical companies and passionate individuals come together to create helpful tools for clinical reporting. By building on R and the open-source community, the pharmaverse makes it easier to gain insights and increase transparency in the pharma industry.
In this post, let’s explore the top 5 pharmaverse packages (by GitHub stars), their interesting features, and how they can be used in clinical data analysis and reporting.
Let’s start our journey with {rtables}, a package contributed by Roche, which enables the creation of tables for reporting clinical trials. It offers a flexible and efficient way to generate publication-quality tables, simplifying the reporting process and ensuring consistency across different trials.
{rtables} was primarily designed to address the needs of the pharmaceutical industry for creating regulatory-ready tables for clinical trial reporting. You can use it to generate tables to summarize patient demographics, adverse events, efficacy endpoints, and other clinical trial data.
The {rtables} package offers several features that make it a useful tool for creating complex tables:
The following example demonstrates how to create a basic table using the {rtables} package. We start by installing and loading the required libraries.
Next, we create a simulated dataset representing clinical trial data. We then define a function to calculate the mean of a biomarker and use {rtables} functions to build and print the table.
# Install rtables if it's not already installed
if (!requireNamespace("rtables", quietly = TRUE)) {
  install.packages("rtables")
}

# Load required library
library(rtables)

# Create a simulated dataset
set.seed(123)
ex_adsl <- data.frame(
  ARM = sample(c("Placebo", "Drug X", "Combination"), 100, replace = TRUE),
  BMRKR2 = sample(c("Low", "Medium", "High"), 100, replace = TRUE),
  RACE = sample(c("Asian", "Black", "White"), 100, replace = TRUE),
  SEX = sample(c("Male", "Female"), 100, replace = TRUE),
  BMRKR1 = rnorm(100, mean = 5, sd = 2)
)

# Define a function to calculate the mean of a biomarker
biomarker_ave <- function(x, ...) {
  val <- if (length(x) > 0) round(mean(x), 2) else "no data"
  in_rows("Biomarker 1 (mean)" = rcell(val))
}

# Create the table
table <- basic_table() |>
  split_cols_by("ARM") |>
  split_cols_by("BMRKR2") |>
  add_colcounts() |>
  split_rows_by("RACE", split_fun = trim_levels_in_group("SEX")) |>
  split_rows_by("SEX") |>
  summarize_row_groups() |>
  analyze("BMRKR1", biomarker_ave) |>
  build_table(ex_adsl)

# Print the table
print(table)
Results
You can learn more about rtables:
Next, let’s take a look at the {admiral} package, initially developed through a collaboration between Roche and GSK. It provides a toolbox of reusable functions and utilities with dplyr-like syntax to prepare CDISC ADaM (Analysis Data Model) datasets. It serves as a valuable resource for statistical programmers looking to build ADaMs according to varying analysis needs while ensuring traceability and compliance with FDA standards. Many more pharma companies and CROs have contributed to the package since its inception in 2021.
Let’s take a look at some features of the package:
Let’s see how to use the {admiral} package to prepare an ADaM dataset.
We start by loading the necessary libraries and creating a sample study dataset. We then derive the ADSL (Subject-Level Analysis Dataset) and add additional variables for further analysis.
library(admiral)
library(dplyr)
library(lubridate)

# Sample study data
study_data <- tibble::tribble(
  ~USUBJID, ~AGE, ~SEX, ~ARM, ~RANDDTC, ~RAND2DTC, ~VISIT, ~VISITDY, ~VSDTC, ~VSTPT, ~VSORRESU, ~VSORRES, ~PARAMCD,
  "01", 34, "M", "Placebo", "2022-12-10", "2023-01-15", "Screening", -7, "2023-01-08", "Pre-dose", "kg", 80, "WEIGHT",
  "02", 45, "F", "Treatment", "2023-01-02", "2023-01-17", "Baseline", 1, "2023-01-17", "Pre-dose", "kg", 65, "WEIGHT",
  "03", 54, "F", "", "2022-10-16", "2023-01-09", "Screening", -7, "2023-01-09", "Pre-dose", "kg", 78, "WEIGHT"
)

# Function to assign an age group.
# case_when() evaluates conditions in order, so each cutoff only needs an upper bound.
format_agegr1 <- function(var_input) {
  case_when(
    var_input < 35 ~ "<35",
    var_input < 50 ~ "35-49",
    var_input >= 50 ~ ">=50",
    TRUE ~ "Missing"
  )
}

# Derive ADSL (Subject-Level Analysis Dataset)
adsl <- study_data |>
  select(USUBJID, AGE, SEX, ARM, RANDDTC, RAND2DTC) |>
  distinct() |>
  # Convert blank strings to NA
  convert_blanks_to_na() |>
  # admiral does not yet support aggregation functions, but dplyr can be used
  mutate(
    AGEGR1 = format_agegr1(AGE)
  ) |>
  # Convert from character to DATE
  derive_vars_dt(
    dtc = RANDDTC,
    new_vars_prefix = "TRTS"
  ) |>
  derive_vars_dt(
    dtc = RAND2DTC,
    new_vars_prefix = "TRTE"
  ) |>
  # Duration between the two dates, without adding one day
  derive_vars_duration(
    new_var = TRTD,
    start_date = TRTSDT,
    end_date = TRTEDT,
    add_one = FALSE
  )

# Display the ADSL dataset
print(adsl)
Results
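If you want to sanity-check the duration step without {admiral}, the same arithmetic can be sketched in base R. This is only a cross-check under our reading of `add_one = FALSE` (a plain difference in days with no extra day added); the `TRTD_INCL` column is a hypothetical name we introduce to show what the inclusive, "add one day" convention would give instead.

```r
# Randomisation dates copied from the sample study data above
rand <- data.frame(
  USUBJID  = c("01", "02", "03"),
  RANDDTC  = c("2022-12-10", "2023-01-02", "2022-10-16"),
  RAND2DTC = c("2023-01-15", "2023-01-17", "2023-01-09")
)

# Plain difference in days between the two dates (no day added)
rand$TRTD <- as.numeric(as.Date(rand$RAND2DTC) - as.Date(rand$RANDDTC))

# Hypothetical column: the inclusive convention counts both the start and end day
rand$TRTD_INCL <- rand$TRTD + 1

print(rand)
```

For the three sample subjects the plain differences are 36, 15, and 85 days, so the inclusive counts are 37, 16, and 86.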
More on {admiral}:
{teal} is an open-source R Shiny framework developed by Roche that enables the creation of interactive data exploration applications for the pharmaceutical industry. {teal} is particularly well-suited for exploring and analyzing data from clinical trials, enabling researchers and clinicians to quickly identify trends, patterns, and insights.
{teal}’s reporting functionality can be used to generate regulatory-ready tables, figures, and listings. Study teams currently use it to explore data interactively and get the code to reproduce those TLGs. In the future, we hope to use it for submission to governing bodies. It can also be used to build interactive dashboards for monitoring and analyzing adverse events in clinical trials, supporting pharmacovigilance efforts. Its modular design allows for the integration of specialized modules for the analysis and visualization of high-dimensional biomarker data.
Here’s the Patient Profile {teal} application for patient-level analysis of clinical trial data from the teal.gallery.
Here’s how you can run the app yourself:
source("https://raw.github.com/insightsengineering/teal.gallery/main/_internal/utils/sourceme.R")

# Run the app
restore_and_run("patient-profile", package_repo = "https://insightsengineering.r-universe.dev")
You can also find the deployed version of the application.
Some Resources on {teal}
The {riskmetric} package provides a framework to quantify the “risk” of R packages by assessing various metrics. Developed by the R Validation Hub, it helps organizations evaluate the quality and suitability of R packages for validated environments. The resulting risk is parameterized by the organization, which makes the decision on how to weigh the risk from the different metrics.
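To make the idea of organization-specific weighting concrete, here is a minimal base R sketch. It is not {riskmetric}'s own implementation: the metric names, scores, and weights are invented for illustration, and `weighted_risk()` is a hypothetical helper. We assume each metric score lies between 0 and 1, with 1 being the most favourable value, and express risk as one minus the weighted mean score.

```r
# Hypothetical metric scores for a single package, each scaled to [0, 1]
scores <- c(has_vignettes = 1.0, has_news = 1.0, code_coverage = 0.6, bugs_closed = 0.4)

# Hypothetical weights: this organization cares most about test coverage
weights <- c(has_vignettes = 1, has_news = 1, code_coverage = 4, bugs_closed = 2)

# Risk as one minus the weighted mean of the scores, so higher means riskier
weighted_risk <- function(scores, weights) {
  stopifnot(
    identical(names(scores), names(weights)),
    all(weights >= 0),
    sum(weights) > 0
  )
  1 - sum(scores * weights) / sum(weights)
}

# The same scores under the organization's weights and under equal weights
risk_custom <- weighted_risk(scores, weights)
risk_equal  <- weighted_risk(scores, setNames(rep(1, length(scores)), names(scores)))

print(round(c(custom = risk_custom, equal = risk_equal), 2))
```

With these made-up numbers, the package looks riskier under the coverage-heavy weighting (0.35) than under equal weights (0.25), which is exactly the kind of judgment call each organization has to make for itself.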
The following example demonstrates how to use the {riskmetric} package to evaluate the risk of selected R packages. We start by loading the necessary libraries and using {riskmetric} functions to assess and score the packages.
# Load necessary libraries
library(dplyr)
library(riskmetric)

# Assess and score R packages
pkg_ref(c("riskmetric", "utils", "tools")) %>%
  pkg_assess() %>%
  pkg_score()
Results
{riskmetric} is closely related to two other projects: {riskassessment}, an app whose main goal is to help those making “package inclusion” requests for validated GxP environments, and {riskscore}, a data package for cataloging {riskmetric} results across public repositories.
More on Riskmetric
{tidyCDISC} is an open-source R package developed by Biogen that provides a set of functions for tidying and manipulating CDISC (Clinical Data Interchange Standards Consortium) datasets. It aims to simplify the process of working with CDISC data by providing an intuitive interface for data transformation tasks and ensuring consistency with the principles of tidy data.
Here’s a demo version of {tidyCDISC} you can try.
In the documentation, you can find more examples of applications and how to use {tidyCDISC}.
More on tidyCDISC
Finally, let’s look at {xportr}, an open-source R package developed by GSK, Atorus, and Appsilon that simplifies the process of creating CDISC-compliant XPT files directly from R. It serves as a valuable tool for clinical programmers working with ADaM or SDTM datasets.
This package ensures compatibility with regulatory submission requirements, providing a seamless bridge between R and traditional SAS-based workflows.
{xportr} is designed to handle the intricacies of the XPORT format, making it easier to share data across different platforms and with regulatory authorities. This capability is crucial for teams working in environments where both R and SAS are used, facilitating smooth and compliant data exchanges.
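To give a feel for the kind of intricacies involved, here is a small base R sketch of a few well-known limits of the version 5 XPORT format: variable names of at most 8 characters, labels of at most 40 characters, and character values of at most 200 bytes. This is not how {xportr} is implemented, and its own checks are more complete; `check_xpt_v5()` and the `adsl_draft` data are hypothetical names made up for this illustration.

```r
# Hypothetical helper: collect basic XPORT v5 violations in a data frame
check_xpt_v5 <- function(df) {
  issues <- character(0)
  for (nm in names(df)) {
    # Variable names: at most 8 characters, letters, digits, and underscores only
    if (nchar(nm) > 8) {
      issues <- c(issues, sprintf("%s: name longer than 8 characters", nm))
    }
    if (!grepl("^[A-Za-z_][A-Za-z0-9_]*$", nm)) {
      issues <- c(issues, sprintf("%s: name contains invalid characters", nm))
    }
    # Variable labels: at most 40 characters
    lbl <- attr(df[[nm]], "label")
    if (!is.null(lbl) && nchar(lbl) > 40) {
      issues <- c(issues, sprintf("%s: label longer than 40 characters", nm))
    }
    # Character values: at most 200 bytes
    if (is.character(df[[nm]]) && any(nchar(df[[nm]], type = "bytes") > 200)) {
      issues <- c(issues, sprintf("%s: value longer than 200 bytes", nm))
    }
  }
  issues
}

# Made-up draft dataset with one compliant and one non-compliant variable
adsl_draft <- data.frame(
  USUBJID = c("01", "02"),
  TREATMENT_ARM = c("Placebo", "Drug X"),
  stringsAsFactors = FALSE
)
attr(adsl_draft$USUBJID, "label") <- "Unique Subject Identifier"
attr(adsl_draft$TREATMENT_ARM, "label") <- "Description of the planned treatment arm for the subject"

issues <- check_xpt_v5(adsl_draft)
print(issues)
```

Here `TREATMENT_ARM` is flagged twice, once for its name and once for its label, while `USUBJID` passes. In a real workflow you would let {xportr} apply your specification metadata and run these checks for you rather than maintaining them by hand.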
Features of {xportr}
In summary, {xportr} is a valuable tool for clinical programmers working with CDISC data, as it helps ensure regulatory compliance, data quality, and workflow efficiency when creating XPT files for clinical trials and submissions.
More on xportr
The pharmaverse offers a rich ecosystem of tools designed to streamline clinical research workflows, ensuring high-quality data management and reporting. By leveraging packages like {rtables}, {admiral}, {teal}, {riskmetric}, {tidyCDISC}, and {xportr}, pharmaceutical companies can enhance their data analysis capabilities, ensure regulatory compliance, and drive innovation in clinical research. Remember to give the packages that you use and value a star on GitHub.
To receive the latest updates on what’s new in the pharmaverse, subscribe to the periodic newsletter!
The post appeared first on appsilon.com/blog/.