5 Sample size for a binary variable

So far almost everything we’ve covered has related to continuous outcome variables, which we assumed to be normally distributed. This allowed us to use familiar techniques such as the \(t\)-test, and to take baseline information into account in an accessible way (the linear model / ANCOVA). However, very often clinical trials do not have a continuous, normally distributed outcome, and in the next two sections we will look at two other common possibilities: binary data (this section) and survival data (next section).

A binary outcome might be something like ‘the patient was alive 2 years after the procedure’ or not, or ‘the patient was clear of eczema within a month’ or not. Such variables are often coded as ‘success’ or ‘failure’, or 1 or 0.

For a trial whose primary outcome variables are binary, the sample size calculations we derived in Chapter 2 will not work, so in this section we’ll work through a similar method developed for binary variables.

Suppose we conduct a trial with a binary primary outcome variable and two groups, \(T\) and \(C\), containing \(n_T\) and \(n_C\) participants respectively. The number of successes in each group, \(R_T\) and \(R_C\), will be Binomially distributed,

\[\begin{align*} R_T &\sim \operatorname{Bi}\left(n_T,\, \pi_T\right) \\ R_C &\sim \operatorname{Bi}\left(n_C,\,\pi_C\right). \end{align*}\]

Our null hypothesis is therefore that \(\pi_T = \pi_C\), i.e. that the probability of success is the same in each group, and we will need enough participants to test this hypothesis with sufficient power. From the trial data we will be able to produce the estimates

\[\begin{align*} p_T & = \frac{r_T}{n_T} \\ p_C & = \frac{r_C}{n_C}, \end{align*}\]

where \(r_T,\,r_C\) are the observed values of \(R_T,\,R_C.\)

Recall that the variance of \(p_X\) (where \(X\) is \(T\) or \(C\)) is \(\frac{\pi_X\left(1-\pi_X\right)}{n_X}\), so the variance depends on the mean. This means there is no free parameter equivalent to \(\sigma\) in the binary situation, and the number of participants required will depend on the approximate values of \(\pi_T\) and \(\pi_C\). This makes the derivation of a sample size formula somewhat more complicated, so we first make a transformation that removes the dependence of the variance on the mean. To do this we use an approximation technique called the delta method.

5.1 The Delta Method

We start with a random variable \(X\) that has mean \(\mu\) and variance \(\sigma^2 = \sigma^2\left(\mu\right)\), i.e. its variance depends on its mean. If we have a ‘well-behaved’ (infinitely differentiable etc.) function \(f\left(X\right)\), what are its mean and variance? To find these exactly requires us to evaluate a sum or integral, which may be analytically intractable, so we instead use a crude approximation.

First, we expand \(f\left(X\right)\) in a first-order Taylor series about \(\mu\), which gives us

\[\begin{equation} f\left(X\right) \approx f\left(\mu\right) + \left(X-\mu\right)f'\left(\mu\right) \tag{5.1} \end{equation}\]

and therefore

\[\begin{equation} \left(f\left(X\right) - f\left(\mu\right)\right)^2 \approx \left(X-\mu\right)^2\left[f'\left(\mu\right)\right]^2. \tag{5.2} \end{equation}\]

If we take expectations of Equation (5.1) we find \(E\left(f\left(X\right)\right) \approx f\left(\mu\right)\). We can use this in the left-hand side of Equation (5.2) so that when we take expectations of Equation (5.2) we find

\[\begin{equation} \operatorname{var}\left(f\left(X\right)\right) \approx \sigma^2\left(\mu\right)\left[f'\left(\mu\right)\right]^2, \tag{5.3} \end{equation}\]

where both sides follow from the definition

\[\operatorname{var}\left(X\right) = \operatorname{E}\left[\left(X - \mu\right)^2\right] .\] This series of approximations, which generally works well, is the delta method.
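As a quick sanity check (not part of the original derivation), the approximation in Equation (5.3) can be tested by simulation. Here, illustratively, we take \(f(X)=\log X\) with a normal \(X\), for which the delta method gives \(\operatorname{var}(\log X) \approx \sigma^2/\mu^2\):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 10.0, 0.5

# Simulate X ~ N(mu, sigma^2) and transform with f(X) = log(X)
x = rng.normal(mu, sigma, size=1_000_000)
empirical = np.log(x).var()

# Delta method: var(f(X)) ~ sigma^2(mu) * [f'(mu)]^2, with f'(mu) = 1/mu
approx = sigma**2 / mu**2

print(empirical, approx)  # both close to 0.0025
```

The agreement is good here because \(\sigma/\mu\) is small; the first-order Taylor expansion degrades as the spread of \(X\) grows relative to the curvature of \(f\).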

One way in which it is often used, and the way in which we will use it now, is to find a transformation \(f\left(X\right)\) for which (at least approximately) the variance is unrelated to the mean. To do this, we solve the differential equation

\[ \operatorname{var}\left[f\left(X\right)\right] = \sigma^2\left(\mu\right) \left[f'\left(\mu\right)\right]^2 = \text{constant}. \] In the case of proportions for a binary variable, this becomes

\[ \frac{\pi\left(1-\pi\right)}{n} \left[f'\left(\pi\right)\right]^2 = K\] for some constant \(K\). We can rearrange this to

\[f'\left(\pi\right) = \sqrt{\frac{Kn}{\pi\left(1-\pi\right)} } \propto \sqrt{\frac{1}{\pi\left(1-\pi\right)}}.\] So we need

\[\int^\pi \sqrt{\frac{1}{u\left(1-u\right)}}du, \] where the notation indicates that we want the anti-derivative, evaluated at \(\pi\). By substituting \(u = w^2\) we find

\[\begin{align*} f\left(\pi\right) & \propto \int^{\sqrt{\pi}}{\frac{1}{\sqrt{w^2\left(1-w^2\right)}}2w\,dw}\\ &\propto \int^{\sqrt{\pi}}{\frac{1}{\sqrt{1 - w^2}}}\,dw\\ & \propto \arcsin{\left(\sqrt{\pi}\right)}. \end{align*}\]

Setting \(f\left(\pi\right) = \arcsin\left(\sqrt{\pi}\right)\) and using the chain rule, we find

\[\left[f'\left(\pi\right)\right]^2 = \frac{1}{4\pi\left(1-\pi\right)} .\] Finally, we can substitute this into Equation (5.3), with \(f\left(X\right) = \arcsin\left(\sqrt{X}\right)\) to find

\[\begin{align*} \operatorname{var}\left[f\left(X\right)\right] & \approx \sigma^2\left(\pi\right)\left[f'\left(\pi\right)\right]^2 \\ & \approx{\frac{\pi\left(1-\pi\right)}{n}\cdot\frac{1}{4\pi\left(1-\pi\right)}}\\ & \approx{\frac{1}{4n}}, \end{align*}\]

and we have achieved our aim of finding a transformation of \(X\) whose variance is not related to the mean. This is sometimes called the angular transformation.
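The variance-stabilising effect can be checked by simulation (a sketch of ours, not from the source): for several values of \(\pi\), the variance of \(\arcsin(\sqrt{p})\) stays close to \(1/(4n)\) even though the variance of \(p\) itself changes with \(\pi\):

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 200, 200_000
target = 1 / (4 * n)  # theoretical variance after the angular transformation

stabilised = []
for pi in (0.2, 0.5, 0.8):
    # Simulated success proportions p = R/n with R ~ Bi(n, pi)
    p = rng.binomial(n, pi, size=reps) / n
    f = np.arcsin(np.sqrt(p))
    stabilised.append(f.var())
    print(pi, p.var(), f.var())  # var(p) varies with pi; var(f) stays near 1/(4n)
```

For \(n=200\) the target is \(1/800 = 0.00125\), and each simulated variance lands close to it regardless of \(\pi\).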

5.2 A sample size formula

For a binary variable, our estimate \(p_X\) (the proportion of successes in group \(X\)) is approximately normally distributed, since the central limit theorem applies. This is not true for small values of \(n\) (less than around 30, which is very small for a clinical trial) or for values of \(\pi\) close to 0 or 1, say \(\pi<0.15\) or \(\pi>0.85\) (this is more likely to be an issue for some trials).

The linear approximation in Equation (5.1) shows us that if \(p_X\) is normally distributed then \(f\left(p_X\right) = \arcsin\left(\sqrt{p_X}\right)\) will be [approximately] normally distributed too. In fact, \(\arcsin\left(\sqrt{p_X}\right)\) is approximately normally distributed with mean \(\arcsin{\left(\sqrt{\pi_X}\right)}\) and variance \(1/\left(4n_X\right)\). Using this information, we can test \(H_0:\,\pi_T =\pi_C\) at the 100\(\alpha\)% significance level by using the variable

\[ D = \frac{\arcsin{\left(\sqrt{p_T}\right)} - \arcsin{\left(\sqrt{p_C}\right)}}{\sqrt{\frac{1}{4n_T} + \frac{1}{4n_C}}}= \frac{\arcsin{\left(\sqrt{p_T}\right)} - \arcsin{\left(\sqrt{p_C}\right)}}{\frac{1}{2}\lambda\left(n_T,n_C\right)}, \] where \(\lambda\left(n_T,\,n_C\right) = \sqrt{\frac{1}{n_T} + \frac{1}{n_C}}\). This is analogous to the variable \(D\) constructed in Section 2.3: the difference between \(f\left(p_T\right)\) and \(f\left(p_C\right)\) divided by the standard error of that difference.
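As an illustration (with made-up counts, not data from the source), \(D\) can be computed directly and compared with the standard normal distribution:

```python
from math import asin, sqrt
from statistics import NormalDist

# Hypothetical trial results (illustrative only)
n_t, r_t = 100, 30   # treatment group: 30 successes out of 100
n_c, r_c = 100, 50   # control group: 50 successes out of 100

p_t, p_c = r_t / n_t, r_c / n_c
lam = sqrt(1 / n_t + 1 / n_c)

# Test statistic on the angular (arcsine) scale
d = (asin(sqrt(p_t)) - asin(sqrt(p_c))) / (0.5 * lam)

# Two-sided p-value against N(0, 1)
p_value = 2 * (1 - NormalDist().cdf(abs(d)))
print(d, p_value)
```

With these counts \(D \approx -2.91\), well beyond \(\pm 1.96\), so at the 5% level we would reject \(H_0:\,\pi_T = \pi_C\).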

Using the same logic as in Sections 2.4 and 2.5, the starting place for a sample size formula to achieve significance level \(\alpha\) and power \(1-\beta\) is

\[ \frac{2\left(\arcsin{\left(\sqrt{\pi_T}\right)} - \arcsin{\left(\sqrt{\pi_C}\right)}\right)}{\lambda\left(n_T,n_C\right)} = z_\beta + z_{\frac{\alpha}{2}}. \] For two groups of equal size \(N\), this leads us to

\[\begin{equation} N = \frac{\left(z_\beta + z_{\frac{\alpha}{2}}\right)^2}{2\left(\arcsin{\left(\sqrt{\pi_T}\right)} - \arcsin{\left(\sqrt{\pi_C}\right)}\right)^2}. \tag{5.4} \end{equation}\]

Because

\[\arcsin{\left(\sqrt{\pi_T}\right)} - \arcsin{\left(\sqrt{\pi_C}\right)}\]

is not a function of \(\pi_T - \pi_C\), we cannot express this in terms of the difference itself, but instead need to specify the expected probabilities of success in each group. In practice, it is likely that the success rate for the control group \(\left(\pi_C\right)\) is well understood, and the probability for the intervention group \(\left(\pi_T\right)\) can be specified by using the nearest clinically important value of \(\pi_T\).

Example 5.1 (From Smith et al. 1994) This trial compared two approaches to managing malignant low bile duct obstruction: surgical biliary bypass and endoscopic insertion of a stent. The primary outcome variable was ‘Did the patient die within 30 days of the procedure?’, and the trial was designed to have \(\alpha=0.05,\;1-\beta=0.95\), which gives \(z_{\frac{\alpha}{2}}=1.96,\,z_{\beta} = 1.65\). The trial wanted to be able to detect a change in the 30-day mortality rate from 0.2 to 0.05. Plugging these numbers into Equation (5.4) gives us

\[ N = \frac{\left(1.65 + 1.96\right)^2}{2\left(\arcsin{\left(\sqrt{0.2}\right)} - \arcsin{\left(\sqrt{0.05}\right)}\right)^2} = 114.9, \] and so each group in our trial should contain 115 patients.

If instead our aim had been to detect a change from around 0.5 to 0.35 (the same in terms of \(\pi_T - \pi_C\)), we would instead have needed

\[ N = \frac{\left(1.65 + 1.96\right)^2}{2\left(\arcsin{\left(\sqrt{0.5}\right)} - \arcsin{\left(\sqrt{0.35}\right)}\right)^2} = 280.8 ,\] that is 281 patients per trial arm.

References

Smith, AC, JF Dowsett, RCG Russell, ARW Hatfield, and PB Cotton. 1994. “Randomised Trial of Endoscopic Stenting Versus Surgical Bypass in Malignant Low Bileduct Obstruction.” The Lancet 344 (8938): 1655–60.