Win Odds Confidence Intervals in R

Jan 10, 2020

Continuous distributions

Mann-Whitney estimate for the win probability

Consider two independent, continuous RVs (random variables) \(\xi\) and \(\eta\). The following probability is called the win probability of \(\eta\) against \(\xi\):

\[\begin{align*} \theta=P(\eta>\xi). \end{align*}\]

Given an i.i.d. (independent, identically distributed) random sample from \(\xi,\) denoted by \(X=(X_1,\cdots,X_{n_1})\) and an i.i.d. sample from \(\eta\) denoted by \(Y=(Y_1,\cdots,Y_{n_2})\) we are interested in estimating the unknown parameter \(\theta.\)

The following estimator is called the Mann-Whitney estimator (or, the win proportion) for the win probability

\[\begin{align*} \hat\theta_N=\frac{1}{n_1n_2}\sum_{i=1}^{n_1}\sum_{j=1}^{n_2}I(X_i<Y_j). \end{align*}\]

Here \(N=n_1+n_2\) is the total sample size and \(I(\cdot)\) is the indicator function, which takes the value 1 if the inequality holds and 0 otherwise. As \(n_1\rightarrow+\infty,\,n_2\rightarrow+\infty,\) the win proportion is a consistent estimator of the win probability, that is, it converges in probability:

\[\begin{align*} \hat\theta_N\longrightarrow\theta. \end{align*}\]

If, in addition, \(\frac{n_1}{N}\rightarrow \alpha\in(0,1)\) as \(n_1\rightarrow+\infty,\,n_2\rightarrow+\infty,\) then the win proportion is also asymptotically normal (Van der Vaart 2000):

\[\begin{align*} \sqrt{N}(\hat\theta_N-\theta)\Longrightarrow{\mathcal N}\left(0,\frac{1}{1-\alpha}\sigma_{10}^2+\frac{1}{\alpha}\sigma_{01}^2\right). \end{align*}\]

Here,

\[\begin{align*} \sigma_{10}^2=Cov(I(\xi<\eta),I(\xi'<\eta))=P(\xi<\eta,\xi'<\eta)-P(\xi<\eta)^2,\\ \sigma_{01}^2=Cov(I(\xi<\eta),I(\xi<\eta'))=P(\xi<\eta,\xi<\eta')-P(\xi<\eta)^2, \end{align*}\]

where \(\xi'\) has the same distribution as \(\xi\), \(\eta'\) has the same distribution as \(\eta\), and \(\xi,\xi',\eta,\eta'\) are mutually independent.
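The double sum in the definition of \(\hat\theta_N\) maps directly onto a matrix of pairwise comparisons. A minimal sketch in base R; the helper name `win_proportion` and the toy samples are illustrative assumptions and do not come from any package:

```r
# Hypothetical helper: Mann-Whitney estimate of P(eta > xi)
# x: sample from xi, y: sample from eta
win_proportion <- function(x, y) {
  # outer() builds the n1 x n2 logical matrix with entries I(X_i < Y_j);
  # its mean equals the double sum divided by n1 * n2
  mean(outer(x, y, "<"))
}

# Toy samples (made-up values)
x <- c(1, 2, 3)
y <- c(2.5, 0.5)
win_proportion(x, y)  # 2 of the 6 pairs satisfy X_i < Y_j
```

This form ignores ties, which is harmless for continuous distributions because ties occur with probability zero.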

Application to exponential distributions

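Suppose now that \(\xi\sim{\mathbb E}(\lambda)\) and \(\eta\sim{\mathbb E}(\mu)\) are exponentially distributed with rates \(\lambda\) and \(\mu,\) so that \(\theta=P(\xi<\eta)=\frac{\lambda}{\lambda+\mu}.\) In this case,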

\[\begin{align*} \sigma_{10}^2=\frac{2\lambda^2}{(\lambda+\mu)(2\lambda+\mu)}-\frac{\lambda^2}{(\lambda+\mu)^2}=\frac{\lambda^2\mu}{(\lambda+\mu)^2(2\lambda+\mu)},\\ \sigma_{01}^2=\frac{\lambda}{\lambda+2\mu}-\frac{\lambda^2}{(\lambda+\mu)^2}=\frac{\lambda\mu^2}{(\lambda+\mu)^2(\lambda+2\mu)}, \end{align*}\]

therefore, the asymptotic variance of the win proportion will be

\[\begin{align*} \sigma^2=\frac{\lambda\mu}{(\lambda+\mu)^2}\left(\frac{1}{1-\alpha}\frac{\lambda}{2\lambda+\mu}+\frac{1}{\alpha}\frac{\mu}{\lambda+2\mu}\right). \end{align*}\]
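The three probabilities entering these formulas can be derived by conditioning on \(\eta\). A short sketch of the computation, assuming the rate parametrization so that \(P(\xi<t)=1-e^{-\lambda t}\) and \(\eta\) has density \(\mu e^{-\mu t}\) for \(t>0\):

\[\begin{align*} P(\xi<\eta)&=\int_0^{\infty}\left(1-e^{-\lambda t}\right)\mu e^{-\mu t}\,dt=1-\frac{\mu}{\lambda+\mu}=\frac{\lambda}{\lambda+\mu},\\ P(\xi<\eta,\xi'<\eta)&=\int_0^{\infty}\left(1-e^{-\lambda t}\right)^2\mu e^{-\mu t}\,dt=1-\frac{2\mu}{\lambda+\mu}+\frac{\mu}{2\lambda+\mu}=\frac{2\lambda^2}{(\lambda+\mu)(2\lambda+\mu)},\\ P(\xi<\eta,\xi<\eta')&=P(\xi<\min(\eta,\eta'))=\frac{\lambda}{\lambda+2\mu}. \end{align*}\]

The last equality uses the fact that \(\min(\eta,\eta')\) is exponentially distributed with rate \(2\mu.\)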

We can check this with the following simulation, which compares the histogram of the normalized win proportion with the limiting normal density:

n1 <- 700
n2 <- 100
N <- n1 + n2
m <- 1000
lambda <- 2
mu <- 10
alpha <- n1/(n1+n2)


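k <- lambda/(lambda+mu)  # true win probability P(xi < eta)
WR <- numeric(m)         # normalized win proportions, one per replicate

for(i in 1:m){
  set.seed(i)
  X1 <- rexp(n1, lambda)            # sample from xi
  X2 <- rexp(n2, mu)                # sample from eta
  d <- expand.grid(x = X1, y = X2)  # all n1*n2 pairs
  # win = 1, tie = 0.5 (ties have probability zero here), loss = 0
  d$w <- ifelse(d$y > d$x, 1, ifelse(d$y == d$x, 0.5, 0))
  WR[i] <- sqrt(N)*(mean(d$w) - k)
}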


x0 <- 3
int <- seq(-x0, x0, 0.001)

# Asymptotic variance sigma^2 from the closed-form expression above
Coeff0 <- mu*lambda/(lambda + mu)^2
Coeff <- Coeff0*(1/(1-alpha)*lambda/(2*lambda+mu) + 1/alpha*mu/(lambda + 2*mu))


hist(WR, nclass = 20, freq = FALSE, xlim = c(-x0, x0), 
     ylim = c(0, 1.1), col = "lightblue", border = "blue")
lines(int, dnorm(int, mean = 0, sd = sqrt(Coeff)), col = "2")
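Besides the visual check, the empirical variance of the simulated values can be compared with the closed-form asymptotic variance. The following self-contained sketch repeats a smaller version of the simulation; the helper name `asym_var_exp`, the seed, and the number of replicates are assumptions made for illustration:

```r
# Hypothetical helper: asymptotic variance of sqrt(N) * (theta_hat - theta)
# for exponential samples with rates lambda (xi) and mu (eta); alpha = n1 / N
asym_var_exp <- function(lambda, mu, alpha) {
  s10 <- lambda^2 * mu / ((lambda + mu)^2 * (2 * lambda + mu))  # sigma_10^2
  s01 <- lambda * mu^2 / ((lambda + mu)^2 * (lambda + 2 * mu))  # sigma_01^2
  s10 / (1 - alpha) + s01 / alpha
}

set.seed(1)
n1 <- 700; n2 <- 100; N <- n1 + n2
lambda <- 2; mu <- 10
theta <- lambda / (lambda + mu)

# Monte Carlo replicates of the normalized win proportion
wr <- replicate(500, {
  x <- rexp(n1, lambda)
  y <- rexp(n2, mu)
  sqrt(N) * (mean(outer(x, y, "<")) - theta)
})

theory <- asym_var_exp(lambda, mu, n1 / N)
c(empirical = var(wr), theoretical = theory)
```

The two numbers should be close but not equal, since the empirical variance carries Monte Carlo error and the formula is only an asymptotic approximation.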

Definition of the win odds

Consider two independent, continuous RVs (random variables) \(\xi\) and \(\eta\). The odds of the win probability are called the win odds of \(\eta\) against \(\xi\):

\[\begin{align*} \omega=\frac{P(\eta>\xi)}{P(\eta<\xi)}=\frac{\theta}{1-\theta}. \end{align*}\]

The Mann-Whitney estimate of the win probability can be transformed by the function \(f(x)=\frac{x}{1-x},\ \ x\in(0,1)\) to get an estimate for the win odds. Using the same transformation and the asymptotic normality of the Mann-Whitney estimate it is possible to construct asymptotic confidence intervals for the win odds, for a given asymptotic confidence level.
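Because \(f\) is increasing on \((0,1)\), a confidence interval for \(\theta\) maps endpoint by endpoint to a confidence interval for \(\omega\) with the same asymptotic level. A minimal sketch, assuming a Wald-type interval for \(\theta\) and a standard error supplied by the user; the helper name `win_odds_ci` and the input values are illustrative only:

```r
# Hypothetical helper: Wald interval for theta mapped to the win-odds scale
# theta_hat: win proportion, se: its estimated standard error
win_odds_ci <- function(theta_hat, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)  # standard normal quantile
  ci_theta <- c(Estimate = theta_hat,
                Lower = theta_hat - z * se,
                Upper = theta_hat + z * se)
  # f(x) = x / (1 - x) is only defined and increasing on (0, 1)
  stopifnot(all(ci_theta > 0 & ci_theta < 1))
  ci_theta / (1 - ci_theta)
}

# Illustrative inputs only, not taken from any dataset
ci <- win_odds_ci(theta_hat = 0.6, se = 0.05)
ci
```

The resulting interval is not symmetric around the point estimate, which is expected for a nonlinear transformation.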

If \(\omega=1\) then the random variables \(\xi\) and \(\eta\) are stochastically equivalent, while \(\omega>1\) means that \(\eta\) is stochastically greater than (wins against) \(\xi.\) The case \(\omega<1\) means that \(\eta\) loses against \(\xi\). The asymptotic confidence interval of the win odds can be used to test the hypothesis \(\omega=1\): the hypothesis is rejected when the interval does not contain 1.

To use the asymptotic normality result described above we need to estimate the asymptotic variance. The R package sanon provides an estimate of the asymptotic standard error of the Mann-Whitney estimate.

Ordinal random variables

(Gasparyan et al. 2021)

The package sanon

We will be using the package sanon (Kawaguchi and Koch 2015).

#install.packages("sanon")
library(sanon)

The dataset resp contains data from a randomized clinical trial to compare a test treatment to placebo for a respiratory disorder.

data(resp, package = "sanon")

head(resp)

##   center treatment sex age baseline visit1 visit2 visit3 visit4
## 1      1         A   F  32        1      2      2      4      2
## 2      2         A   F  37        1      3      4      4      4
## 3      1         A   F  47        2      2      3      4      4
## 4      2         A   F  39        2      3      4      4      4
## 5      1         A   M  11        4      4      4      4      2
## 6      2         A   F  60        4      4      3      3      4

The column visit4 is a numeric vector of patient global ratings of symptom control on 5 categories (4 = excellent, 3 = good, 2 = fair, 1 = poor, 0 = terrible), measured at visit 4. To compare the effect of the active treatment against placebo we will use the win probability, which, as defined previously, is an unknown theoretical quantity. The null hypothesis of no treatment difference can be written in terms of the win probability as \(\theta=0.5.\) The Mann-Whitney estimate of the win probability can be calculated as follows:

fit <- sanon(visit4 ~ grp(treatment, ref="P"), data = resp)

fit

## Call:
## sanon.formula(formula = visit4 ~ grp(treatment, ref = "P"), data = resp)
## 
## Sample size: 111
## 
## Response levels:
## [visit4; 5 levels] (lower) 0, 1, 2, 3, 4 (higher)
## 
## Design Matrix:
##        [,1]
## visit4    1
## 
## Mann-Whitney Estimate 
##  for comparison [ A / P ] :
## visit4 
## 0.6174
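For an ordinal response such as visit4, ties between the two groups are common, so the win proportion needs a rule for them. The sketch below counts each tie as half a win, the same rule used in the simulation code earlier. The helper name `win_proportion_ties` and the toy ratings are made up for illustration, and that sanon uses exactly this tie rule is an assumption here:

```r
# Hypothetical helper: win proportion with ties counted as half a win
# ref: responses in the reference group, trt: responses in the comparison group
win_proportion_ties <- function(ref, trt) {
  wins <- outer(ref, trt, "<")   # I(ref_i < trt_j)
  ties <- outer(ref, trt, "==")  # I(ref_i == trt_j)
  mean(wins + 0.5 * ties)
}

# Toy ordinal ratings on the 0-4 scale (made-up values)
placebo <- c(0, 2, 2, 3)
active  <- c(2, 3, 4)
win_proportion_ties(placebo, active)
```

With the resp data loaded, the analogous call would be `with(resp, win_proportion_ties(visit4[treatment == "P"], visit4[treatment == "A"]))`; it should agree with the estimate above only if the tie-handling assumption holds.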

confint(fit)

## M-W Estimate and 95% Confidence Intervals 
## :
##        Estimate  Lower  Upper
## visit4   0.6174 0.5173 0.7176

fit$p

##            [,1]
## [1,] 0.02150601
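The reported p-value can be approximately reconstructed from the printed estimate and confidence limits. The sketch below assumes that the p-value comes from a two-sided Wald test of \(\theta=0.5\) and that the interval is the estimate plus or minus a normal quantile times the standard error; both are assumptions about the package, and the variable names are illustrative:

```r
# Values read off the rounded confint(fit) output above
est   <- 0.6174
lower <- 0.5173
upper <- 0.7176

# Back out the standard error from the half-width of the 95% interval
se <- (upper - lower) / (2 * qnorm(0.975))

# Two-sided Wald test of theta = 0.5 (assumed form of the test)
z <- (est - 0.5) / se
p_wald <- 2 * pnorm(-abs(z))
p_wald
```

Because the inputs are rounded to four decimals, the result is close to, but not identical with, the value stored in `fit$p`.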

The p-value is below 0.05 and, correspondingly, the asymptotic 95% confidence interval excludes \(0.5,\) hence the null hypothesis of no treatment difference is rejected at the 0.05 significance level. The Mann-Whitney estimate and its confidence limits can be transformed to get an estimate and a confidence interval for the win odds:

confint(fit)$ci/(1-confint(fit)$ci)

##        Estimate    Lower    Upper
## visit4 1.614013 1.071762 2.540746

For the win odds the null hypothesis of no treatment difference is \(\omega=1.\) The estimated win odds of 1.61, with a 95% confidence interval from 1.07 to 2.54 that excludes 1, quantifies the treatment effect in favor of the active treatment.

References

Gasparyan, Samvel B, Elaine K Kowalewski, Folke Folkvaljon, Olof Bengtsson, Joan Buenconsejo, John Adler, and Gary G Koch. 2021. “Power and Sample Size Calculation for the Win Odds Test: Application to an Ordinal Endpoint in COVID-19 Trials.” Journal of Biopharmaceutical Statistics, 1–23. https://doi.org/10.1080/10543406.2021.1968893.

Kawaguchi, Atsushi, and Gary G. Koch. 2015. “sanon: An R Package for Stratified Analysis with Nonparametric Covariable Adjustment.” Journal of Statistical Software 67 (9): 1–37. https://doi.org/10.18637/jss.v067.i09.

Van der Vaart, Aad W. 2000. Asymptotic Statistics. Vol. 3. Cambridge University Press. https://doi.org/10.1017/CBO9780511802256.


Samvel B. Gasparyan

Biostatistician

Biostatistician in cardiovascular trials.