# Win Odds Confidence Intervals in R

## Continuous distributions

### Mann-Whitney estimate for the win probability

Consider two independent, continuous random variables (RVs) \(\xi\) and \(\eta\). The following probability is called the *win probability* of \(\eta\) against \(\xi\):

\[\begin{align*} \theta=P(\eta>\xi). \end{align*}\]

Given an i.i.d. (independent, identically distributed) random sample from \(\xi,\) denoted by \(X=(X_1,\cdots,X_{n_1}),\) and an i.i.d. sample from \(\eta,\) denoted by \(Y=(Y_1,\cdots,Y_{n_2}),\) we are interested in estimating the unknown parameter \(\theta.\)

The following estimator of the win probability is called the *Mann-Whitney* estimator (or the *win proportion*):

\[\begin{align*} \hat\theta_N=\frac{1}{n_1n_2}\sum_{i=1}^{n_1}\sum_{j=1}^{n_2}I(X_i<Y_j). \end{align*}\]

Here \(N=n_1+n_2\) is the total sample size and \(I(\cdot)\) is the indicator function, which takes the value 1 if the inequality holds and 0 otherwise. As \(n_1\rightarrow+\infty\) and \(n_2\rightarrow+\infty,\) the win proportion is a consistent estimator of the win probability, that is, it converges in probability:

\[\begin{align*} \hat\theta_N\longrightarrow\theta. \end{align*}\]

If, in addition, \(\frac{n_1}{N}\rightarrow \alpha\) with \(\alpha\in(0,1),\) then the win proportion is also asymptotically normal (Van der Vaart 2000):

\[\begin{align*} \sqrt{N}(\hat\theta_N-\theta)\Longrightarrow{\mathcal N}\left(0,\frac{1}{1-\alpha}\sigma_{10}^2+\frac{1}{\alpha}\sigma_{01}^2\right). \end{align*}\]

Here,

\[\begin{align*} \sigma_{10}^2=\mathrm{Cov}(I(\xi<\eta),I(\xi'<\eta))=P(\xi<\eta,\xi'<\eta)-P(\xi<\eta)^2,\\ \sigma_{01}^2=\mathrm{Cov}(I(\xi<\eta),I(\xi<\eta'))=P(\xi<\eta,\xi<\eta')-P(\xi<\eta)^2, \end{align*}\]

where \(\xi'\) has the same distribution as \(\xi\), \(\eta'\) has the same distribution as \(\eta\), and \(\xi,\xi',\eta,\eta'\) are mutually independent.
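The double sum in the win proportion is simply the share of pairs \((X_i,Y_j)\) with \(X_i<Y_j\). A minimal sketch in R, where the function name `win_proportion` and the toy data are illustrative assumptions rather than part of any package:

```r
# Win proportion: share of all n1 * n2 pairs (X_i, Y_j) with X_i < Y_j.
# `win_proportion` is a hypothetical helper name used only for illustration.
win_proportion <- function(x, y) {
  # outer() builds the n1 x n2 logical matrix of indicators I(X_i < Y_j);
  # mean() of a logical matrix is the proportion of TRUE entries.
  mean(outer(x, y, FUN = "<"))
}

# Toy data (made up): 3 observations of xi, 2 observations of eta
x <- c(1.2, 0.4, 2.5)
y <- c(0.9, 3.1)
win_proportion(x, y)  # 4 of the 6 pairs satisfy x < y
```

For continuous data ties occur with probability zero, so no tie handling is needed in this version.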

### Application to exponential distributions

Suppose now that \(\xi\sim{\mathbb E}(\lambda)\) and \(\eta\sim{\mathbb E}(\mu),\) that is, \(\xi\) and \(\eta\) are exponentially distributed with rates \(\lambda\) and \(\mu.\) In this case,

\[\begin{align*} \sigma_{10}^2=\frac{2\lambda^2}{(\lambda+\mu)(2\lambda+\mu)}-\frac{\lambda^2}{(\lambda+\mu)^2}=\frac{\lambda^2\mu}{(\lambda+\mu)^2(2\lambda+\mu)},\\ \sigma_{01}^2=\frac{\lambda}{\lambda+2\mu}-\frac{\lambda^2}{(\lambda+\mu)^2}=\frac{\lambda\mu^2}{(\lambda+\mu)^2(\lambda+2\mu)}. \end{align*}\]

Therefore, the asymptotic variance of the win proportion is

\[\begin{align*} \sigma^2=\frac{\lambda\mu}{(\lambda+\mu)^2}\left(\frac{1}{1-\alpha}\frac{\lambda}{2\lambda+\mu}+\frac{1}{\alpha}\frac{\mu}{\lambda+2\mu}\right). \end{align*}\]
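The probabilities entering \(\sigma_{10}^2\) and \(\sigma_{01}^2\) can be obtained by conditioning on the shared variable. A sketch of the calculation, assuming the rate parametrization with densities \(\lambda e^{-\lambda x}\) and \(\mu e^{-\mu y}\) on \((0,+\infty)\):

\[\begin{align*} P(\xi<\eta)&=\int_0^{+\infty}e^{-\mu x}\lambda e^{-\lambda x}\,dx=\frac{\lambda}{\lambda+\mu},\\ P(\xi<\eta,\xi'<\eta)&=\int_0^{+\infty}\left(1-e^{-\lambda y}\right)^2\mu e^{-\mu y}\,dy=1-\frac{2\mu}{\lambda+\mu}+\frac{\mu}{2\lambda+\mu}=\frac{2\lambda^2}{(\lambda+\mu)(2\lambda+\mu)},\\ P(\xi<\eta,\xi<\eta')&=\int_0^{+\infty}e^{-2\mu x}\lambda e^{-\lambda x}\,dx=\frac{\lambda}{\lambda+2\mu}. \end{align*}\]

In the second line, given \(\eta=y\) the events \(\xi<y\) and \(\xi'<y\) are independent, each with probability \(1-e^{-\lambda y}\); in the third line, given \(\xi=x\) the events \(\eta>x\) and \(\eta'>x\) are independent, each with probability \(e^{-\mu x}.\)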

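We can check this with the following simulation, which compares the histogram of \(\sqrt{N}(\hat\theta_N-\theta)\) over repeated samples with the limiting normal density: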

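```
# Simulation settings
n1 <- 700                     # sample size for xi
n2 <- 100                     # sample size for eta
N <- n1 + n2                  # total sample size
m <- 1000                     # number of simulated data sets
lambda <- 2                   # rate of xi
mu <- 10                      # rate of eta
alpha <- n1 / N               # proportion of observations from xi
k <- lambda / (lambda + mu)   # true win probability P(xi < eta)

# Centered and scaled win proportion for each simulated data set
WR <- numeric(m)
for (i in seq_len(m)) {
  set.seed(i)
  X1 <- rexp(n1, lambda)
  X2 <- rexp(n2, mu)
  d <- expand.grid(x = X1, y = X2)   # all n1 * n2 pairs
  d$w <- ifelse(d$y > d$x, 1, ifelse(d$y == d$x, 0.5, 0))
  WR[i] <- sqrt(N) * (mean(d$w) - k)
}

# Theoretical asymptotic variance
Coeff0 <- mu * lambda / (lambda + mu)^2
Coeff <- Coeff0 * (1 / (1 - alpha) * lambda / (2 * lambda + mu) +
                     1 / alpha * mu / (lambda + 2 * mu))

# Histogram of the simulated values with the limiting normal density
x0 <- 3
int <- seq(-x0, x0, 0.001)
hist(WR, nclass = 20, freq = FALSE, xlim = c(-x0, x0),
     ylim = c(0, 1.1), col = "lightblue", border = "blue")
lines(int, dnorm(int, mean = 0, sd = sqrt(Coeff)), col = "red")
```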
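Besides the visual comparison, the empirical variance of the scaled estimates can be set against the theoretical value. The following is a smaller, self-contained sketch; the sample sizes, the number of replications, and the variable names are illustrative choices, not taken from the simulation above.

```r
set.seed(1)
n1 <- 200; n2 <- 100; N <- n1 + n2   # illustrative sample sizes
m <- 500                             # number of replications
lambda <- 2; mu <- 10                # rates of xi and eta
alpha <- n1 / N
theta <- lambda / (lambda + mu)      # true win probability P(xi < eta)

# sqrt(N) * (win proportion - theta) for each simulated data set
scaled_err <- replicate(m, {
  x <- rexp(n1, lambda)
  y <- rexp(n2, mu)
  sqrt(N) * (mean(outer(x, y, "<")) - theta)
})

# Asymptotic variance from the closed-form expression
sigma2 <- lambda * mu / (lambda + mu)^2 *
  (1 / (1 - alpha) * lambda / (2 * lambda + mu) +
     1 / alpha * mu / (lambda + 2 * mu))

c(empirical = var(scaled_err), theoretical = sigma2)
```

The two numbers should be close but not identical, since the empirical variance carries Monte Carlo error and the formula is only an asymptotic approximation.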

### Definition of the win odds

As before, consider two independent, continuous random variables (RVs) \(\xi\) and \(\eta\). The odds corresponding to the win probability are called the *win odds* of \(\eta\) against \(\xi\):

\[\begin{align*} \omega=\frac{P(\eta>\xi)}{P(\eta<\xi)}=\frac{\theta}{1-\theta}. \end{align*}\]

The Mann-Whitney estimate of the win probability can be transformed by the function \(f(x)=\frac{x}{1-x},\ \ x\in(0,1)\) to get an estimate for the win odds. Using the same transformation and the asymptotic normality of the Mann-Whitney estimate it is possible to construct asymptotic confidence intervals for the win odds, for a given asymptotic confidence level.
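Because \(f\) is increasing on \((0,1),\) applying it to the endpoints of a confidence interval for \(\theta\) gives a confidence interval for \(\omega\) with the same asymptotic level. A minimal sketch, where the function `win_odds_ci`, its arguments, and the input values are hypothetical and `se` stands for an estimated standard error of \(\hat\theta_N\):

```r
# Wald-type interval for the win probability, mapped to the win odds scale.
# `win_odds_ci` is a hypothetical helper, not a function from any package.
win_odds_ci <- function(theta_hat, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)          # normal quantile for the level
  ci_theta <- c(estimate = theta_hat,
                lower = theta_hat - z * se,
                upper = theta_hat + z * se)
  # The transformation x / (1 - x) is only defined on (0, 1)
  stopifnot(all(ci_theta > 0 & ci_theta < 1))
  ci_theta / (1 - ci_theta)                # monotone map to the odds scale
}

# Made-up inputs for illustration
res <- win_odds_ci(theta_hat = 0.6, se = 0.05)
res
```

The resulting interval is not symmetric around the point estimate, which is expected for a nonlinear transformation.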

If \(\omega=1,\) then the random variables \(\xi\) and \(\eta\) are stochastically equivalent, while \(\omega>1\) means that \(\eta\) is stochastically greater than (wins against) \(\xi.\) The case \(\omega<1\) means that \(\eta\) loses against \(\xi\). The asymptotic confidence interval for the win odds can be used to test the hypothesis \(\omega=1\): the hypothesis is rejected when the interval does not contain 1.

To use the asymptotic normality result described above, we need to estimate the asymptotic variance. The *R* package *sanon* provides an estimate of the asymptotic standard error of the Mann-Whitney estimate.

## Ordinal random variables

### The package sanon

We will use the package **sanon** (Kawaguchi and Koch 2015).

```
#install.packages("sanon")
library(sanon)
```

The dataset *resp* contains data from a randomized clinical trial to compare a test treatment to placebo for a respiratory disorder.

```
data(resp, package = "sanon")
head(resp)
```

```
## center treatment sex age baseline visit1 visit2 visit3 visit4
## 1 1 A F 32 1 2 2 4 2
## 2 2 A F 37 1 3 4 4 4
## 3 1 A F 47 2 2 3 4 4
## 4 2 A F 39 2 3 4 4 4
## 5 1 A M 11 4 4 4 4 2
## 6 2 A F 60 4 4 3 3 4
```

The column *visit4* is a numeric vector of patient global ratings of symptom control in five categories (4 = excellent, 3 = good, 2 = fair, 1 = poor, 0 = terrible), measured at visit 4. To compare the effect of the active treatment against placebo we will use the win probability, which, as defined previously, is an unknown theoretical quantity. The null hypothesis of no treatment difference can be written in terms of the win probability as \(\theta=0.5.\) The Mann-Whitney estimate of the win probability can be calculated as follows:

```
fit <- sanon(visit4 ~ grp(treatment, ref="P"), data = resp)
fit
```

```
## Call:
## sanon.formula(formula = visit4 ~ grp(treatment, ref = "P"), data = resp)
##
## Sample size: 111
##
## Response levels:
## [visit4; 5 levels] (lower) 0, 1, 2, 3, 4 (higher)
##
## Design Matrix:
## [,1]
## visit4 1
##
## Mann-Whitney Estimate
## for comparison [ A / P ] :
## visit4
## 0.6174
```
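Unlike continuous responses, ordinal responses such as *visit4* produce many ties. One common convention, the same one used for `d$w` in the simulation code above, counts a tie as half a win. The sketch below applies it to made-up ratings; the helper name `win_proportion_ties` and the data are hypothetical, and whether **sanon** handles ties in exactly this way internally is an assumption that is not verified here.

```r
# Win proportion with ties counted as 0.5 (hypothetical helper)
win_proportion_ties <- function(ref, trt) {
  wins <- outer(ref, trt, "<")    # treatment rating strictly higher
  ties <- outer(ref, trt, "==")   # equal ratings
  mean(wins + 0.5 * ties)
}

# Made-up ordinal ratings on the 0-4 scale
placebo <- c(0, 1, 2, 2, 3)
active  <- c(2, 3, 4, 4)
win_proportion_ties(placebo, active)  # (15 wins + 0.5 * 5 ties) / 20 pairs
```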

`confint(fit)`

```
## M-W Estimate and 95% Confidence Intervals
## :
## Estimate Lower Upper
## visit4 0.6174 0.5173 0.7176
```

`fit$p`

```
## [,1]
## [1,] 0.02150601
```
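As a rough cross-check, the printed confidence limits can be used to recover an approximate standard error and a two-sided normal p-value. This sketch assumes that the interval is a symmetric Wald interval and that the test is a two-sided normal test of \(\theta=0.5\); it works from the rounded values printed above, so only approximate agreement with `fit$p` should be expected.

```r
# Rounded values copied from the confint(fit) output
est   <- 0.6174
lower <- 0.5173
upper <- 0.7176

# Standard error implied by a symmetric 95% Wald interval (assumption)
se <- (upper - lower) / (2 * qnorm(0.975))

# Two-sided normal p-value for H0: theta = 0.5
z <- (est - 0.5) / se
p_approx <- 2 * pnorm(-abs(z))
p_approx
```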

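The p-value (about 0.0215) is below 0.05 and, consistently, the 95% asymptotic confidence interval (0.5173, 0.7176) does not contain 0.5; hence the null hypothesis of no treatment difference is rejected at the 5% significance level. The Mann-Whitney estimate and its confidence limits can be transformed by \(f(x)=\frac{x}{1-x}\) to obtain an estimate and a confidence interval for the win odds: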

`confint(fit)$ci/(1-confint(fit)$ci)`

```
## Estimate Lower Upper
## visit4 1.614013 1.071762 2.540746
```

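For the win odds, the null hypothesis of no treatment difference is \(\omega=1.\) The estimated win odds of 1.61 quantify the treatment effect in favor of the active treatment, and the 95% confidence interval (1.07, 2.54) excludes 1, in agreement with the rejection of the null hypothesis above.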

## References

*Journal of Biopharmaceutical Statistics*, 1–23. https://doi.org/10.1080/10543406.2021.1968893.

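Kawaguchi, Atsushi, and Gary G. Koch. 2015. "sanon: An R Package for Stratified Analysis with Nonparametric Covariable Adjustment." *Journal of Statistical Software* 67 (9): 1–37. https://doi.org/10.18637/jss.v067.i09.

Van der Vaart, A. W. 2000. *Asymptotic Statistics*. Vol. 3. Cambridge University Press. https://doi.org/10.1017/CBO9780511802256.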