library('dplyr') library('ggplot2') library('stringr') library('tidyverse') library('matrixcalc') library('LaplacesDemon') # for Dirichlet distribution library('diagram')
This post shows how to use the elementary theory of discrete time Markov Chains to construct a multi-state model of patients progressing through various health states in a randomized clinical trial comparing different treatments for asthma management under the assumption that all patients in each of the two arms of the trial will eventually experience treatment failure. The post shows how to calculate transition probabilities using a simple multinomial Bayesian model and exploit the theory of absorbing Markov chains to calculate fundamental health metrics, including the expected time spent in each health state, survival curves, and the expected time to treatment failure for each of the two treatments. These estimates provide a natural way to compare the effectiveness of the two treatments and can be used to drive cost-effectiveness comparisons.
Although the theory is basic and the calculations are simple, variations of this model should be applicable to a wide range of problems in health economics and decision analysis.
library('dplyr') library('ggplot2') library('stringr') library('tidyverse') library('matrixcalc') library('LaplacesDemon') # for Dirichlet distribution library('diagram')
# names for transition probabilities trans_names <- function(x) { transitions <- paste(x, "-", 1:5, sep = "") return(transitions) } sim_res <- function(matrix, from_state, n_sims = 20000) { # Bayesian simulation using Dirichlet conjugate prior # matrix is the matrix of observed states # from_state is the row index of the initial state # n_sims is the number of simulations to run priors <- matrix(rep(1, 20), nrow = 4) # Prior parameters for Dirichlet dist HERE dist <- matrix[from_state, ] + priors[from_state, ] res <- rdirichlet(n_sims, dist) colnames(res) <- paste(from_state, "-", 1:5, sep = "") return(res) } smry <- function(res_matrix) { apply(res_matrix, 2, function(x) { c( mean = mean(x), lower = quantile(x, probs = 0.025), upper = quantile(x, probs = 0.975) ) }) } plot_row_dist <- function(matrix, treatment, start_state) { # code to plot the posterior marginal distributions # Inputs: # matrix the simulation matrix for the health state of interest # treatment: either Seretide or Fluticasone as a character # start_state: name of health state that agrees with matrix as a character treatment <- treatment start_state <- start_state plot_df <- matrix %>% as.data.frame() %>% pivot_longer(cols = everything(), names_to = "transitions", values_to = "prob") ggplot(plot_df, aes(x = prob)) + geom_histogram(aes(y = after_stat(density)), bins = 15, fill = "lightgrey", color = "black") + # histogram for each category geom_density(aes(y = after_stat(density)), color = "red", linewidth = 0.5) + # density line scale_x_continuous(breaks = scales::pretty_breaks(n = 5)) + xlab("probability") + facet_wrap(~transitions, scales = "free") + # faceting for each category labs(x = "Probability", y = " ") + ggtitle(paste(treatment, ": Posterior State Transition Probabilities for states starting in", start_state)) } # Function to compute the probability of being in each state at time t prob_at_time <- function(matrix, time, i_state) { u <- i_state index_eq_1 <- which(u == 1) m <- matrix.power(matrix, time) u_t <- u %*% m # Distribution at time t rownames(u_t) <- names(states)[index_eq_1] round(u_t, 2) } time_in_state2 <- function(tpm, n) { # Function to compute the expected time spent in each state # tpm - the transition probability matrix # n - the number of time periods I <- diag(5) # Initial state vectors s_time <- matrix(0, nrow = 5, ncol = 5) m <- matrix(0, nrow = 5, ncol = 5) for (i in 1:5) { for (j in 1:n) { m[i, ] <- I[i, ] %*% matrix.power(tpm, j) s_time[i, ] <- s_time[i, ] + m[i, ] colnames(s_time) <- c("STW", "UTW", "Hex", "Pex", "TF") rownames(s_time) <- c("STW", "UTW", "Hex", "Pex", "TF") } } return(s_time) }
The data used in this post originates from a four arm randomized trial comparing treatments for asthma management, including Seretide and Fluticasone, conducted by Kavuru et al. [3]. I report the data presented in the text by Welton et al. [5] and the paper by Briggs et al. [2]. The data comprise the number of patients in each of five health states at the end of each week for a 12-week follow-up period. The states are defined as follows: STW
= successfully treated week, UTW
= unsuccessfully treated week, Hex
= hospital-managed exacerbation ,Pex
= primary care-managed exacerbation, and TF
= treatment failure.
State tables such as these are a common way to summarize the results of a clinical study.
states <- c( "STW" = "sucessfully treated week", "UTW" = "unsucessfully treated week", "Hex" = "hospital-managed exacerbation", "Pex" = "primary care-managed exacerbation", "TF" = "treatment failure" ) treatments <- c("Seretide", "Fluticasone") s_state <- matrix( c( 210, 60, 0, 1, 1, 88, 641, 0, 4, 13, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 81 ), nrow = 5, byrow = TRUE, dimnames = list(names(states), names(states)) ) f_state <- matrix( c( 66, 32, 0, 0, 2, 42, 752, 0, 5, 20, 0, 0, 0, 0, 0, 0, 4, 0, 1, 0, 0, 0, 0, 0, 156 ), nrow = 5, byrow = TRUE, dimnames = list(names(states), names(states)) )
s_state
STW UTW Hex Pex TF STW 210 60 0 1 1 UTW 88 641 0 4 13 Hex 0 0 0 0 0 Pex 1 0 0 0 1 TF 0 0 0 0 81
f_state
STW UTW Hex Pex TF STW 66 32 0 0 2 UTW 42 752 0 5 20 Hex 0 0 0 0 0 Pex 0 4 0 1 0 TF 0 0 0 0 156
The state diagram, which shows directed arcs between the states, is a useful way to visualize the Markov model. The arcs represent the possible transitions between states, and the numbers on the arcs represent the from-to state transitions. Since there are no arcs emanating from TF
it is clear that this state is being modeled as an absorbing state, similar to death in a survival model.
The Bayesian is a simple conjugate prior model with a Multinomial likelihood function and Dirichlet prior distribution. Because these two distributions are conjugate, the posterior distribution will also be a Dirichlet.
The Dirichlet distribution is a multivariate generalization of the beta distribution, which is parameterized by a vector of positive real numbers and is defined over a simplex a generalization of the concept of a triangle to higher dimensions. The probability density function (PDF) of the Dirichlet distribution is given by:
where ,
is the indicator function,
is the gamma function, and the simplex
is the space of probability distributions. You can think of the Dirichlet distribution as a distribution of probability distributions. It models proportions or probabilities that sum to one, such as the transition probabilities in a Markov chain, and is often used as a prior distribution in Bayesian models since it is conjugate with the Multinomial distribution. (See Tufts in the references below.). The parameters of the Dirichlet posterior distribution are the vector sum of the prior parameters and the observed counts. When the
parameters are all equal, the Dirichlet distribution is symmetric and uniform over the simplex. When the parameters are unequal, the distribution is skewed towards the larger parameters. In the code below, the
parameters are set to 1.
This section of code uses the sim_res
function to simulate the posterior distribution of the transition probabilities for Seretide, starting in each of the four transient states. The results are summarized using the smry
function, which computes the mean and 95% credible intervals for each transition probability.
s_STW_sim <- sim_res(matrix = s_state, from_state = 1) s_STW_smry <- smry(s_STW_sim) s_UTW_sim <- sim_res(matrix = s_state, from_state = 2) s_UTW_smry <- smry(s_UTW_sim) s_Hex_sim <- sim_res(matrix = s_state, from_state = 3) s_Hex_smry <- smry(s_Hex_sim) s_Pex_sim <- sim_res(matrix = s_state, from_state = 4) s_Pex_smry <- smry(s_Pex_sim)
This code plots the marginal posterior distributions of the transition probabilities for Seretide starting in health state, STW
. Note that the marginal distributions of a Dirichlet distribution are Beta distributions. The plot supports this theoretical result.
This section of code uses the sim_res
function to simulate the posterior distribution of the transition probabilities for Fluticasone, starting in each of the four transient states. The results are summarized using the smry
function, which computes the mean and 95% credible intervals for each transition probability.
f_STW_sim <- sim_res(matrix = f_state, from_state = 1) f_STW_smry <- smry(f_STW_sim) f_UTW_sim <- sim_res(matrix = f_state, from_state = 2) f_UTW_smry <- smry(f_UTW_sim) f_Hex_sim <- sim_res(matrix = s_state, from_state = 3) f_Hex_smry <- smry(f_Hex_sim) f_Pex_sim <- sim_res(matrix = s_state, from_state = 4) f_Pex_smry <- smry(f_Pex_sim)
This code plots the marginal posterior distributions of the transition probabilities for Fluticasone starting in STW.
In this section, we present some theoretical results for discrete time absorbing Markov chains that are applicable to analyzing the asthma treatment data. The first step is to recover the mean transition probabilities from the posterior distributions computed above. Using these, we construct the Fundamental Matrix for the absorbing Markov chains associate with each treatment group.
# Seretide transition Probabilities s_TP <- rbind( s_STW_smry[1, ], s_UTW_smry[1, ], s_Hex_smry[1, ], s_Pex_smry[1, ] ) s_TP <- rbind(s_TP, c(0, 0, 0, 0, 1)) # Add the absorbing state TF colnames(s_TP) <- c("STW", "UTW", "Hex", "Pex", "TF") rownames(s_TP) <- c("STW", "UTW", "Hex", "Pex", "TF") # s_TP # Fluticasone transition Probabilities f_TP <- rbind( f_STW_smry[1, ], f_UTW_smry[1, ], f_Hex_smry[1, ], f_Pex_smry[1, ] ) f_TP <- rbind(f_TP, c(0, 0, 0, 0, 1)) # Add the absorbing state TF colnames(f_TP) <- c("STW", "UTW", "Hex", "Pex", "TF") rownames(f_TP) <- c("STW", "UTW", "Hex", "Pex", "TF")
round(s_TP, 2)
STW UTW Hex Pex TF STW 0.76 0.22 0.00 0.01 0.01 UTW 0.12 0.85 0.00 0.01 0.02 Hex 0.20 0.20 0.20 0.20 0.20 Pex 0.29 0.14 0.14 0.14 0.28 TF 0.00 0.00 0.00 0.00 1.00
round(f_TP, 2)
STW UTW Hex Pex TF STW 0.64 0.31 0.01 0.01 0.03 UTW 0.05 0.91 0.00 0.01 0.03 Hex 0.20 0.20 0.20 0.20 0.20 Pex 0.29 0.14 0.14 0.14 0.29 TF 0.00 0.00 0.00 0.00 1.00
For an absorbing Markov Chain, each entry of the fundamental matrix,
, gives the expected number of times the process is in state
given that it starts in state
. For our example, this corresponds to the number of weeks that a patient is expected to spend in each of the transient states before reaching the absorbing state, TF.
The first step in getting to is to partition the transition matrix into sub-matrices of
of transient state transition probabilities, and
, absorbing state transition probabilities as illustrated in the figure below. If there are
transitive states and
absorbing state,
is a
matrix,
is a
matrix,
is a
matrix, and
is a
identity matrix. The figure is reproduced from Professor Nickolay Atanasov’s Chapter 11 notes on Markov chains [1] which presents lucid account of the theory presented here.
This code partitions the transition probability matrix as described above. Note that we only have one absorbing state.
Q_s <- s_TP[1:4, 1:4] # Extract the sub-matrix of transition probabilities for non-absorbing states rownames(Q_s) <- names(states)[1:4] colnames(Q_s) <- names(states)[1:4] # Set the row and column names to the state names # round(Q_s,3) Q_f <- f_TP[1:4, 1:4] # Extract the sub-matrix of transition probabilities for non-absorbing states rownames(Q_f) <- names(states)[1:4] colnames(Q_f) <- names(states)[1:4] # Set the row and column names to the state names # round(Q_f,3)
round(Q_s, 2)
STW UTW Hex Pex STW 0.76 0.22 0.00 0.01 UTW 0.12 0.85 0.00 0.01 Hex 0.20 0.20 0.20 0.20 Pex 0.29 0.14 0.14 0.14
round(Q_f, 2)
STW UTW Hex Pex STW 0.64 0.31 0.01 0.01 UTW 0.05 0.91 0.00 0.01 Hex 0.20 0.20 0.20 0.20 Pex 0.29 0.14 0.14 0.14
Calculating involves computing the inverse of the matrix
which is easy enough to do in
R
. Remember each entry of N gives the expected number of times that an absorbing process is expected to be in the transient state
before absorption, given that it starts in state
. So, given that patients start in health state
STW
, the expected number of weeks that patients are expected to spend in the various states before treatment failure are given by the first rows of and
respectively.
.
I <- diag(4) # Identity matrix of size 4 N_s <- solve(I - Q_s) # Fundamental matrix for Seretide # round(N_s,3) N_f <- solve(I - Q_f) # Fundamental matrix for Fluticasone # round(N_f,3)
round(N_s, 2)
STW UTW Hex Pex STW 22.24 34.60 0.25 0.51 UTW 18.88 36.35 0.23 0.49 Hex 13.47 23.09 1.46 0.63 Pex 12.86 21.54 0.37 1.53
round(N_f, 2)
STW UTW Hex Pex STW 6.91 26.22 0.18 0.34 UTW 4.55 29.14 0.16 0.33 Hex 3.78 17.96 1.42 0.52 Pex 3.69 16.56 0.32 1.42
The expected time to absorption from each transient state is the sum of the expected number of times the process visits each transient state before absorption occurs. It is given by the formula , where
is a vector of ones. This is a useful measure for any healthcare economics study as the monetary costs and QALYS assigned to each state determine the cost-effectiveness of a treatment.
# Calculation for Seretide c <- c(1,1,1,1) E_s <- N_s %*% c # Expected time to absorption for Seretide colnames(E_s) <- "Seretide" #Calculation for Fluticasone E_f <- N_f %*% c # Expected time to absorption for Fluticasone colnames(E_f) <- "Fluticasone" # Combine the expected times to absorption for both treatments E <- cbind(E_s, E_f) %>% data.frame() round(E,2)
Seretide Fluticasone STW 57.60 33.65 UTW 55.96 34.18 Hex 38.66 23.68 Pex 36.29 22.00
Probability of being in each state at time t starting from state STW as given by where
is the initial state vector, and
is the transition probability matrix. To provide a specific example, the following code calculates the probability of patients being in the health state
STW
at week 13, the week after the follow up period of the study, given that they started in STW
.
# Seretide t <- 13 u <- c(1, 0, 0, 0, 0) # Initial state vector, starting in STW spt <- prob_at_time(s_TP, t, u) # Fluticasone t <- 12 u <- c(1, 0, 0, 0, 0) # Initial state vector, starting in STW fpt <- prob_at_time(f_TP, t, u) p_in_state <- rbind(spt, fpt) rownames(p_in_state) <- c("Seretide start STW", "Fluticasone start STW") p_in_state
STW UTW Hex Pex TF Seretide start STW 0.29 0.51 0 0.01 0.19 Fluticasone start STW 0.10 0.58 0 0.01 0.31
From here, it is trivial to compute the probability that patients will not be in the absorption state as time progresses and plot a standard survival curve.
N <- 156 # weeks s_surv_dat <- vector("numeric", length = N) f_surv_dat <- vector("numeric", length = N) u <- c(1, 0, 0, 0, 0) # Initial state vector, starting in STW for (t in 1:N) { s_dist <- prob_at_time(s_TP, t, u) s_surv_dat[t] <- sum(s_dist[1:4]) # Sum of probabilities of being in transient states) f_dist <- prob_at_time(f_TP, t, u) f_surv_dat[t] <- sum(f_dist[1:4]) # Sum of probabilities of being in transient states } survival_df <- data.frame( time = 1:N, s_prob = s_surv_dat, f_prob = f_surv_dat ) survival_df_l <- survival_df %>% pivot_longer( cols = c(s_prob, f_prob), names_to = "treatment", values_to = "probability" ) %>% mutate(treatment = recode(treatment, s_prob = "Seretide", f_prob = "Fluticasone")) ggplot(survival_df_l) + geom_line(aes( x = time, y = probability, group = treatment, color = treatment ), linewidth = 1.5) + labs( title = "Probability of being in transient states over time", x = "Time (weeks)", y = "Probability" ) + scale_x_continuous(breaks = seq(0, N, by = 5)) + scale_y_continuous(labels = scales::percent_format(scale = 1))
The curves clearly suggest that Seretide would be the preferred treatment.
This section of code computes a matrix that provides the expected time the Markov chain will spend in each state up to some time n. Each row of the matrix assumes the process starts in the state designated by the row name. The entry for each column of that row is the expected time spent in that state up to time n. Matrices are computed for both Seretide and Fluticasone.
Here we choose n = 26 weeks.
Note that in a healthcare cost-effectiveness model, these times could be used to compute the financial and quality of life costs for patients who survive to n weeks.
n <- 26 s_state_times <- time_in_state2(tpm = s_TP, n = n) f_state_times <- time_in_state2(tpm = f_TP, n = n)
# Seretide round(s_state_times, 2)
STW UTW Hex Pex TF STW 8.61 12.17 0.09 0.19 4.94 UTW 6.63 13.59 0.08 0.18 5.52 Hex 5.12 8.25 0.36 0.42 11.85 Pex 5.02 7.61 0.27 0.33 12.77 TF 0.00 0.00 0.00 0.00 26.00
# Fluticasone round(f_state_times, 2)
STW UTW Hex Pex TF STW 3.75 13.61 0.11 0.19 8.34 UTW 2.36 15.32 0.08 0.18 8.07 Hex 2.30 9.28 0.36 0.42 13.64 Pex 2.31 8.51 0.28 0.33 14.57 TF 0.00 0.00 0.00 0.00 26.00
Here, for both treatments, we compute the total time the chain spends in the transient states, assuming that the process starts in STW
. These times agree nicely with the expected time to absorption computed above.
# Seretide s_time_in_STW <- sum(s_state_times[1,1:4]) # Exclude the absorbing state TF f_time_in_STW <- sum(f_state_times[1,1:4]) # Exclude the absorbing state TF time_in_STW <- data.frame( Seretide = s_time_in_STW, Fluticasone = f_time_in_STW ) round(time_in_STW,2)
Seretide Fluticasone 1 21.06 17.66
I have constructed a very simple, Bayesian Markov Chain model to analyse the data for two arms of a clinical trial. This model should be relevant to any multi-state, time-to-event analysis where the data are given as state counts collected at regular intervals. In particular, the model should be helpful for analyzing disease progression data that meet these criteria. Coding the model with R
from first principles is straight forward because R
was designed for work with matrices, probability distributions, and Monte-Carlo simulations.
I specifically do not want to claim that model presented here is an adequate final analysis for the asthma treatment data. This model is just the beginning of a serious analysis. However, it does demonstrate how straight forward it would be to conduct different kinds of sensitivity analyses or investigate the effects of non-informative priors.
[1] Atanasov, N Chapter 11: Markov Chains
[2] Briggs AH, Ades AE, Price MJ. Probabilistic Sensitivity Analysis for Decision Trees with Multiple Branches: Use of the Dirichlet Distribution in a Bayesian Framework. Medical Decision Making, 2003
[3] Kavuru M, Melamed J, Gross G, Laforce C, House K, Prillaman B, Baitinger L, Woodring A, and Shah T, (2000) Salmeterol and fluticasone propionate combined in a new powder inhalation device for the treatment of asthma: a randomized, double-blind, placebo-controlled trial
[4] Tufts, C. The Little Book of LDA
[5] Welton NJ, Sutton AJ, Cooper NJ, Abrams KR, and Ades AE (2010) Evidence Synthesis for Decision Making in Healthcare. Cambridge University Press.