B Computer Practical 2 - Allocation & Analysis

This computer practical will cover two main parts:

  1. Allocation (Section B.1)
  2. Analysis (Section B.2)

Health warning!

This practical, particularly the allocation section, is designed to help you investigate the behaviour of the various methods we’ve been looking at. There are lots of things we do here that you couldn’t do in a real trial. For example, you wouldn’t be able to plot the baseline covariates, since you wouldn’t have them all at the start of the allocation process. You also wouldn’t be able to re-do the allocation lots of times, because in most cases the participants join the trial (and are allocated) sequentially.

Preliminaries

The idea is that you spend roughly half the time on each part, though there is probably more content in the allocation section. It might be sensible to work through some of the allocation exercises, jump ahead to analysis after a while, then go back to allocation if you have time.

You will need a lot of the techniques covered in this practical for your summative assignments, so if you don’t finish it in class, treat finishing it as a sort of informal formative assignment. At the very least, you might find yourself returning to it while working on your summative assignments.

There will be a mixture of full solutions, examples of possible solutions and example code to adapt. If you’re not sure how to do something, please ask!

In the example code, I have used the same names for most objects. In order to store your results and avoid confusion, it will be sensible to name things intelligently! For example, suffix each allocation data frame so that you know which allocation method you used. Create an R file with the commands in that you use, so that you can easily replicate your work.

R practicalities

There are many, many packages in R that implement methods for designing and analysing clinical trials (see a list at CRAN task view). We will look at some of these, and will also write our own code for some tasks. Remember that to install a package, you can do

install.packages("<packagename>")

If you have problems running R on your laptop, or on the university machines, the most foolproof way might be to use GitHub Codespaces (thanks to Louis Aslett, who developed this for Data Science and Statistical Computing II). You may be familiar with this approach if you did Bayesian Computational Modelling III.

An advantage of this is that you can open the same codespace (the same instance of R) from any computer, so if you plan to work on things (for example your summative assignment, which will involve some R) from more than one computer, this might be ideal.

This requires you to have a GitHub account (you can sign up for free here) and there is a short guide to creating a GitHub account here.

Direct link to codespace

Instructions for how to use codespace

B.1 Allocation

We’ll make use of several packages in this section. Installing them all now should hopefully prevent it from disrupting your flow! We’ll load them as we go along.

Notice that where there are code snippets like this one that you want to copy directly, you can hover your cursor in the top right of the code box and a ‘copy’ icon will appear.

install.packages(c("medicaldata", "ggplot2", "gtsummary", "Minirand", 
                   "blockrand", "dplyr", "randomizeR"))

B.1.1 Licorice gargle dataset

To work with allocation, we will use the licorice_gargle dataset from the package medicaldata, which you can find by

library(medicaldata)
data("licorice_gargle")

You can find out about the dataset by looking at the help file

?licorice_gargle

and at this website, and if you’re feeling really keen you can even read the original paper: Ruetzler et al. (2013) (but please don’t do any of that now!).

Exercise B.1 Of the 19 columns, how many are baseline characteristics that we could use in the allocation?

Click for solution

Solution. In this dataset the baseline characteristics are helpfully prefixed by preOp, so there are seven baseline characteristics.

## [1] "preOp_gender"     "preOp_asa"        "preOp_calcBMI"    "preOp_age"       
## [5] "preOp_mallampati" "preOp_smoking"    "preOp_pain"

In order to be able to work with the dataset, we need to convert several of the columns to factor variables (you can see by looking at the structure str(licorice_gargle) that all columns are numeric to begin with), so run the following code to convert the necessary columns to factors, and to restrict the dataset to only the columns we’re interested in.

lic_garg = licorice_gargle[ ,1:8]
# vector of names of columns to be coerced to factor
cols <- c("preOp_gender", "preOp_asa",  
"preOp_mallampati", "preOp_smoking", "preOp_pain", "treat")
# convert each of those columns to factors
lic_garg[cols] <- lapply(lic_garg[cols], factor) 

# Check the result:
str(lic_garg)
## 'data.frame':    235 obs. of  8 variables:
##  $ preOp_gender    : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ preOp_asa       : Factor w/ 3 levels "1","2","3": 3 2 2 2 1 2 3 2 3 3 ...
##  $ preOp_calcBMI   : num  33 23.7 26.8 28.4 30.4 ...
##  $ preOp_age       : num  67 76 58 59 73 61 66 61 83 69 ...
##  $ preOp_mallampati: Factor w/ 4 levels "1","2","3","4": 2 2 2 2 1 3 1 2 1 2 ...
##  $ preOp_smoking   : Factor w/ 3 levels "1","2","3": 1 2 1 1 2 1 1 1 1 3 ...
##  $ preOp_pain      : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ treat           : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...

B.1.1.1 Demographic tables

For every allocation, we’ll want to see a summary table of how the patients are distributed between the groups. In this practical we’ll use the function tbl_summary from gtsummary, but there are plenty of other ways to create nice tables around clinical trials (for example the packages table1 and atable).

Notice that the licorice gargle data contains the allocation used in the trial, so we can create a table summarising the participants in each group.

library(gtsummary)
tbl_summary(lic_garg, by = "treat")
Characteristic         0, N = 117¹          1, N = 118¹
preOp_gender
    0                  73 (62%)             69 (58%)
    1                  44 (38%)             49 (42%)
preOp_asa
    1                  19 (16%)             22 (19%)
    2                  67 (57%)             67 (57%)
    3                  31 (26%)             29 (25%)
preOp_calcBMI          26.1 (22.4, 28.1)    25.7 (22.7, 28.6)
preOp_age              63 (45, 68)          61 (48, 68)
preOp_mallampati
    1                  31 (26%)             39 (33%)
    2                  69 (59%)             66 (56%)
    3                  16 (14%)             13 (11%)
    4                  1 (0.9%)             0 (0%)
preOp_smoking
    1                  45 (38%)             45 (38%)
    2                  36 (31%)             36 (31%)
    3                  36 (31%)             37 (31%)
preOp_pain
    0                  115 (98%)            118 (100%)
    1                  2 (1.7%)             0 (0%)
¹ n (%); Median (Q1, Q3)

Notice that the package Minirand provides a function randbalance that returns these counts as a list of tables, one per covariate; we can save the result as an object and access the individual sub-tables, for example

library(Minirand)
rb_tab = randbalance(
  trt = lic_garg$treat, 
  covmat = lic_garg[,-8], 
  ntrt=2, 
  trtseq = c("0", "1"))
rb_tab$preOp_gender
##    
## trt  0  1
##   0 73 44
##   1 69 49

There are also packages with functions to output these demographic tables formatted for use in LaTeX documents, for example atable.

B.1.1.2 Binning continuous variables

Before we can use any of our allocation methods, we’re going to need to bin the two numeric variables preOp_age and preOp_calcBMI. The sort of information we’d be likely to get pre-trial (the point at which we need to decide on the bins) would be

  • The study was open to all adults (aged 18+), but preOp_age was likely to centre around age 60.
  • preOp_calcBMI was expected to have mean 26 and standard deviation around 4.5.

Exercise B.2 Bin the two variables preOp_age and preOp_calcBMI to convert them into factor variables BMI and age.

Investigate how the bins you have chosen split the actual data - are you pleased with your choices?

In the solutions we’ll create a new data frame lg_df with the two new factor variables instead of the original two numeric variables. It will be helpful if you make sure your objects and columns are named the same (or else you may need to alter code further into the practical).

Click for solution

Solution. These solutions show you one way to create such factor variables, but if you choose different bins that’s fine!

Based on the information given, one reasonable split for preOp_age would be \(<50,\;50-70\) and \(>70\), in which case we could do

lic_garg$age[lic_garg$preOp_age < 50] <- "Under 50"
lic_garg$age[lic_garg$preOp_age >= 50 & lic_garg$preOp_age < 70] <- "50 to 70"
lic_garg$age[lic_garg$preOp_age >= 70] <- "70 plus"
lic_garg$age = factor(lic_garg$age, levels = c("Under 50", "50 to 70", "70 plus"))
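Equivalently, you could build the same factor in one step with base R’s cut function (the breaks here match the bins chosen above):

lic_garg$age = cut(
  lic_garg$preOp_age,
  breaks = c(-Inf, 50, 70, Inf),
  right = FALSE,  # intervals closed on the left: [-Inf,50), [50,70), [70,Inf)
  labels = c("Under 50", "50 to 70", "70 plus")
)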

We can see how the data fall into these categories:

table(lic_garg$age)
## 
## Under 50 50 to 70  70 plus 
##       67      122       46

These aren’t equal but we have a decent number in each group.

With the BMI measurement, we know that there are predefined boundaries between categories, so it might be sensible to use those. The boundary between ‘medium’ and ‘high’ is usually given as 25, so since that is near our given mean we could use that as the boundary.

lic_garg$BMI[lic_garg$preOp_calcBMI < 25] <- "medium_or_low"
lic_garg$BMI[lic_garg$preOp_calcBMI >= 25] <- "high"
lic_garg$BMI = factor(lic_garg$BMI, levels = c("medium_or_low", "high"))

Again, we can see how the participants fall into these bins

table(lic_garg$BMI)
## 
## medium_or_low          high 
##            99           136

and the split is fairly even.

Finally, we can select the columns we want from lic_garg to create lg_df

lg_df = lic_garg[,c(1,2,5,6,7,9,10,8)]

If you’ve chosen different bins, that’s fine! But it will help you if the column names and data frame names are the same as on this page.

B.1.1.3 A measure of imbalance

It will be useful to have a simple numerical summary of how imbalanced an allocation is. We will define the imbalance as in lectures,

\[D(n) = \lvert N_T\left(n\right) - N_C\left(n\right) \rvert \] This is pretty simple to define in R:

imbalance = function(
    df,   # participant data frame with allocation column included
    alloc # name of allocation column
    ){
  alloc_vec = as.factor(df[ ,names(df)==alloc])
  alloc_lev = levels(alloc_vec) # how the treatment groups are coded
  n1 = nrow(df[df[alloc]==alloc_lev[1],])
  n2 = nrow(df[df[alloc]==alloc_lev[2],])
  abs(n1-n2)
}

Exercise B.3 Use the imbalance function above to find the imbalance in the allocation recorded in the lic_garg dataset.

Click for solution

Solution.

imbalance(df = lic_garg, alloc = "treat")
## [1] 1

B.1.2 Allocation methods

To mimic the way that participants are recruited sequentially in a trial (which is generally the case), the code for each type of allocation will work its way through the participant data frame, even though this might not be the most efficient way to produce the end result. For the more complex methods we’ll use packages. Feel free to write your own code for the simpler methods if you want to!

We’ll start by performing each method once with the whole dataset, and then go on to include baseline variables, and finally we’ll perform a simulation study.

B.1.2.1 Simple random allocation

In simple random allocation, each participant is allocated to one of the two trial arms with equal probability.

srs = function(
    df, # DF should be the participant data frame. 
        # A column 'treat' will be added
    levels = c("0", "1") # Levels of treat factor
){
  n = nrow(df) # number of rows / participants
# Create a new column 'treat'
  df$treat = rep(NA, n)
# work through the rows, randomly allocating patients with probability 1/2
  for (i in 1:n){
    df$treat[i] = sample(levels, size=1, prob = c(0.5, 0.5))
  }
  df$treat = as.factor(df$treat)
  df
}

Exercise B.4 Use the function srs above to allocate the patients in the licorice_gargle dataset to groups T or C.

Generate the balance table and imbalance and comment on them.

Click for solution

Solution. To allocate the patients, use

lg_srs = srs(
  df = lg_df[,-8],
  levels = c("T", "C")
)

And to display the balance table

tbl_summary(lg_srs, by = "treat")
Characteristic         C, N = 105¹        T, N = 130¹
preOp_gender
    0                  60 (57%)           82 (63%)
    1                  45 (43%)           48 (37%)
preOp_asa
    1                  22 (21%)           19 (15%)
    2                  59 (56%)           75 (58%)
    3                  24 (23%)           36 (28%)
preOp_mallampati
    1                  31 (30%)           39 (30%)
    2                  59 (56%)           76 (58%)
    3                  15 (14%)           14 (11%)
    4                  0 (0%)             1 (0.8%)
preOp_smoking
    1                  42 (40%)           48 (37%)
    2                  27 (26%)           45 (35%)
    3                  36 (34%)           37 (28%)
preOp_pain
    0                  105 (100%)         128 (98%)
    1                  0 (0%)             2 (1.5%)
age
    Under 50           33 (31%)           34 (26%)
    50 to 70           54 (51%)           68 (52%)
    70 plus            18 (17%)           28 (22%)
BMI
    medium_or_low      47 (45%)           52 (40%)
    high               58 (55%)           78 (60%)
¹ n (%)

To find the imbalance, the command is

imbalance(lg_srs, alloc = "treat")
## [1] 25

Since this uses the sample function, and we didn’t set a seed, you may well have got a different allocation (and therefore imbalance).
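If you want your allocation to be reproducible, set a seed before calling srs; a minimal sketch:

set.seed(123) # any fixed seed gives a reproducible allocation
lg_srs = srs(df = lg_df[,-8], levels = c("T", "C"))
imbalance(lg_srs, alloc = "treat")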

B.1.2.2 Randomly permuted blocks

We will do this with the package blockrand.

library(blockrand)

The function for generating RPB designs is also called blockrand. The default for the function blockrand is that it will randomly vary block length within \(\left\lbrace 2,\,4,\,6,\,8 \right\rbrace\).
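If you want a fixed block length, you can restrict the block.sizes argument. Note (see the help file) that block sizes are specified in numbers of complete sets of treatments, so with two arms block.sizes = 2 gives blocks of length 4:

blockrand(n = 20, levels = c("T", "C"), block.sizes = 2) # all blocks of length 4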

Exercise B.5 Try playing around with the function blockrand, for example starting with

blockrand(n=100)

Can you generate an allocation for the licorice_gargle data (remember our data frame is now lg_df) and produce the balance table?

Click for solution

Solution. There are 235 rows/participants in lg_df, and to keep consistency we would like the levels of our treatment to be "T" and "C". Therefore we can do

rpb_lg = blockrand(n=235, levels = c("T", "C"))

Notice that this doesn’t have exactly 235 rows: the blockrand function always finishes after a whole block, so it may generate a few extra allocations.

Let’s add this to our participant data to create a new data frame lg_rpb:

# create the new data frame, a copy of lg_df
lg_rpb = lg_df  

# Replace the original treat column with the RPB treatment column
# Using only the first 235 allocations
lg_rpb$treat = rpb_lg$treatment[1:235]

Then we can generate the demographic table as before:

tbl_summary(lg_rpb, by = "treat")

and calculate the imbalance

imbalance(lg_rpb, alloc = "treat")
## [1] 1

The package blockrand contains a function plotblockrand, which outputs PDFs of randomization cards, ready to be printed and put into envelopes!

B.1.2.3 Biased coin designs

We can write code for a biased coin design by adapting the srs function above, setting \(p=\frac{2}{3}\) by default as per Efron (1971).

biased_coin = function(
    data,
    levels = c("T", "C"),
    p=2/3
){
  Dn = 0 # starting value of imbalance
  n = nrow(data)
  alloc = rep(NA, n)
  
  for (i in 1:n){
    if (Dn==0){ # equally balanced
      alloc[i] = sample(levels, size=1, prob=c(0.5, 0.5) )
    } else if(Dn<0){ # More allocations to levels[2] up to this point
      alloc[i] = sample(levels, size=1, prob=c(p, 1-p) )
    } else if(Dn>0){ # More allocations to levels[1] up to this point
      alloc[i] = sample(levels, size=1, prob=c(1-p, p) )
    }
    # Compute imbalance at this stage
    alloc_to_n = alloc[1:i]
    Dn = sum(alloc_to_n==levels[1]) - sum(alloc_to_n == levels[2])
  }
  data$treat = as.factor(alloc)
  data
}

Exercise B.6 Use the function biased_coin above to allocate patients to the two groups. Produce the demographic table and calculate the imbalance. Try this for some different values of p.

Click for solution

Solution. To create a biased coin design with \(p=0.9\) (for example), we enter

lg_bc1 = biased_coin(
  data=lg_df[,-8],
  p=0.9
)

and to produce a balance table:

tbl_summary(lg_bc1, by = "treat")

Finally we can find the imbalance:

imbalance(lg_bc1, alloc="treat")
## [1] 1

B.1.2.4 Urn designs

For the urn design, we will use the function udPar from the package randomizeR. We wrote this in lectures as \(UD\left(r,s\right)\), where

  • \(r\) is the number of balls for each treatment group in the urn to begin with
  • \(s\) is the number of balls added per treatment after each allocation

Exercise B.7 Look at the help file for udPar. Which arguments correspond to \(r\) and \(s\) (as we called them in Section 3.2.3)?

The function udPar creates a model object, storing the parameters for the particular urn design. To generate sequences of allocations, we use the function genSeq.

Exercise B.8 Create an urn design object using udPar with \(r=3,\;s=1\), set to create enough allocations for the licorice gargle data.

Use the function genSeq to generate a sequence of allocations.

Click for solution

Solution.

library(randomizeR)
ud_31 = udPar(235, 3, 1, c("0", "1"))
seq_ud31 = genSeq(ud_31)

The allocation vector is stored as a row of the matrix M, which is a slot (a component) of the object returned by genSeq. You can access it by

seq_ud31$M[1,]
##   [1] 1 0 1 0 0 0 1 1 0 0 0 1 0 1 1 1 1 1 0 0 1 0 0 1 1 0 1 0 0 1 0 0 1 0 1 1 0 1 0 1 1 0 0 0
##  [45] 1 0 1 0 1 1 1 1 1 1 0 0 0 1 0 0 1 1 0 0 0 1 1 0 0 1 1 0 1 1 1 1 1 0 1 1 0 1 0 1 0 1 1 0
##  [89] 1 0 0 0 0 1 0 1 1 0 1 0 0 1 0 1 1 1 0 1 1 0 0 1 1 1 1 0 0 0 1 0 0 1 0 0 0 0 1 1 0 0 0 1
## [133] 1 0 0 1 0 1 0 0 1 1 1 0 1 0 0 0 1 0 1 0 1 1 0 1 1 0 0 1 1 1 1 0 1 1 1 0 0 1 1 0 1 1 0 1
## [177] 0 1 0 0 1 0 1 0 0 0 1 0 0 0 0 1 1 0 0 1 0 0 0 0 1 1 1 1 0 1 1 0 0 1 1 0 0 0 1 1 0 0 1 0
## [221] 1 1 1 1 0 0 1 0 0 1 0 0 0 1 1

The argument r allows you to generate r sequences, in which case the matrix M has r rows.
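For example, reusing the ud_31 parameter object from above:

seqs_ud31 = genSeq(ud_31, r = 10) # generate 10 allocation sequences at once
dim(seqs_ud31$M)                  # M now has 10 rows, one per sequence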

B.1.3 Stratifying the dataset

The first method we considered for taking account of baseline measurements was stratified sampling. For this, we split the dataset into a number of strata, within each of which the participants have the same levels of the stratifying factors. For example, in some hypothetical dataset one stratum might contain only women, aged 50-65, with a history of smoking.

Once we have stratified the dataset, we can apply any of the methods above by simply treating the different strata as separate ‘mini-trials’.

It’s not uncommon for some of the baseline data to be missing, in which case the following algorithms won’t work without some adjustment. Two possible approaches in that case would be

  • to impute a ‘dummy’ value for missing data, chosen using some available information about likely values
  • to not involve that covariate in the allocation method (especially if it’s thought unlikely to be informative).

An obvious problem with applying stratification to this dataset is the number of strata. We have (if you have the same numbers of levels for age and BMI as I do)

\[2\times{3}\times{4}\times{3}\times{2}\times{3}\times{2} = 864 \] strata. Even by collecting levels together and so on, we are not going to get down to a sufficiently small number of strata. Therefore we will choose just a couple of covariates to stratify by. For example, gender and smoking status:

library(dplyr)
# Add an ID variable so that we can keep track of the order of participants
lg_df$ID = 1:nrow(lg_df)
# split the data frame according to levels of factors
strat_gen_sm <- lg_df %>%
  group_split(preOp_gender, preOp_smoking) 

will create a list of data frames, one for each combination of preOp_gender and preOp_smoking. In this case there are six data frames; for example the first (accessed by strat_gen_sm[[1]]) contains all participants with preOp_gender=0 and preOp_smoking=1. You can choose different factors to stratify by if you want to!

Exercise B.9 Split your data frame using the code above, choosing two factors to stratify by. How many participants are in each stratum?

Click for solution

Solution. We’ll stick with the stratification above. To find the numbers of participants in each stratum we do

group_sizes = sapply(
  1:length(strat_gen_sm),
  function(i){
    nrow(strat_gen_sm[[i]])
  }
)
group_sizes
## [1] 63 47 32 27 25 41

B.1.3.1 Another measure of imbalance

To see how well balanced our allocations are in terms of each covariate, we will define a function summing the marginal imbalance

marg_imbalance = function(
    df,  # participant data frame, including allocation and all factor variables
    alloc, # name of allocation column
    factors # names of prognostic factors to be included
    ){
  df = as.data.frame(df) # deals with tibbles
  n_fact = length(factors) # the numbers of factors
  imb_sum=0                # a running total of imbalance
  for (i in 1:n_fact){     # loop through the factors 
    ind_i = (1:ncol(df))[names(df)==factors[i]]
    col_i = as.factor(df[ ,ind_i])
    levels_i = levels(col_i)
    nlevels_i = length(levels_i)
    for (j in 1:nlevels_i){ # loop through the levels of factor i
      # df_ij contains just those entries with level j of factor i
      df_ij = df[df[ ,ind_i]==levels_i[j] , ] 
      imb_ij = imbalance(df=df_ij, alloc=alloc) # find the imbalance for the sub-data-frame in which factor i has level j
      imb_sum = imb_sum + imb_ij
    }
  }
  imb_sum
}

For example, to find the marginal imbalance over the gender and age factors, use

marg_imbalance(df=lg_df, alloc="treat", factors = c("preOp_gender", "age"))
## [1] 20

Note that the larger the total number of factor levels, the larger the marginal imbalance will be, so if you’re comparing between methods, make sure you’re including all the same factors!
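For example, to compare the trial’s actual allocation with your SRS allocation from Exercise B.4 (assuming you still have lg_srs), use the same factors in both calls:

marg_imbalance(df = lg_df,  alloc = "treat", factors = c("preOp_gender", "age"))
marg_imbalance(df = lg_srs, alloc = "treat", factors = c("preOp_gender", "age"))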

Exercise B.10 Choose a couple of methods from Section B.1.2, and use them with your stratified dataset. Use the marg_imbalance function to find the marginal imbalance. Try this for just the factors you stratified by, and for other collections of factors. What do you expect to see?

Click for solution

Solution. The code for this will vary, but the basic idea is to work through the stratum data frames one by one, applying the functions from Section B.1.2. One way to do this is with a for loop. The code below shows how this would work for simple random sampling, but you can change the function to use whichever method you prefer.

# This command creates an empty list, which we will fill with allocation data frames as we go through
alloc_list = list()
# The loop works through the stratified data frames, applies SRS to allocate patients
# and stores them in alloc_list
for (i in 1:length(strat_gen_sm)){
  alloc_list[[i]] = srs(strat_gen_sm[[i]])
}
# bind all the data frames back together again
alloc_full= dplyr::bind_rows(alloc_list)
# re-order according to ID variable
alloc_full[order(alloc_full$ID),]
## # A tibble: 235 × 9
##    preOp_gender preOp_asa preOp_mallampati preOp_smoking preOp_pain age      BMI   treat    ID
##    <fct>        <fct>     <fct>            <fct>         <fct>      <fct>    <fct> <fct> <int>
##  1 0            3         2                1             0          50 to 70 high  0         1
##  2 0            2         2                2             0          70 plus  medi… 0         2
##  3 0            2         2                1             0          50 to 70 high  1         3
##  4 0            2         2                1             0          50 to 70 high  1         4
##  5 0            1         1                2             0          70 plus  high  1         5
##  6 0            2         3                1             0          50 to 70 high  0         6
##  7 0            3         1                1             0          50 to 70 high  0         7
##  8 0            2         2                1             0          50 to 70 high  0         8
##  9 0            3         1                1             0          70 plus  medi… 0         9
## 10 0            3         2                3             0          50 to 70 high  1        10
## # ℹ 225 more rows

It would be silly though to do this with SRS - why?

Once you’ve performed the allocation, you can find the demographic tables, imbalance and marginal imbalance as before (with alloc_full, or whatever yours is called, as the data frame).
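For example, a sketch assuming your re-combined data frame is called alloc_full as above:

alloc_full = as.data.frame(alloc_full) # bind_rows returns a tibble; convert before using imbalance
imbalance(df = alloc_full, alloc = "treat")
marg_imbalance(df = alloc_full, alloc = "treat",
               factors = c("preOp_gender", "preOp_smoking", "age"))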

B.1.4 Minimisation

If we want to try to achieve balance for all prognostic factors, minimisation is a more suitable method. The function Minirand in the package Minirand implements the minimisation algorithm.

Much like we did in lectures (and like we would in a real trial), the function Minirand works from the point of view of having already allocated \(j-1\) participants, and being presented with the \(j^{th}\).

This example code is copied from the function’s help file, but with some extra comments to help you to follow it. It first creates a participant data frame, then allocates the participants to treatment groups using minimisation.

## Information about the treatment
ntrt <- 3 # There will be three treatment groups
trtseq <- c(1, 2, 3) # the treatment groups are indexed 1, 2, 3
ratio <- c(2, 2, 1)  # the treatment groups will be allocated in a 2:2:1 ratio

## The next few rows generate the participant data frame
nsample <- 120 # we will have 120 participants
c1 <- sample(seq(1, 0), nsample, replace = TRUE, prob = c(0.4, 0.6)) 
c2 <- sample(seq(1, 0), nsample, replace = TRUE, prob = c(0.3, 0.7))
c3 <- sample(c(2, 1, 0), nsample, replace = TRUE, prob = c(0.33, 0.2, 0.5)) 
c4 <- sample(seq(1, 0), nsample, replace = TRUE, prob = c(0.33, 0.67)) 
covmat <- cbind(c1, c2, c3, c4) # generate the matrix of covariate factors for the subjects
# label of the covariates 
colnames(covmat) = c("Gender", "Age", "Hypertension", "Use of Antibiotics") 
covwt <- c(1/4, 1/4, 1/4, 1/4) # equal weights/importance applied to each factor

## Applying the algorithm - start here if you already have participant data!

res <- rep(NA, nsample) # Generate a vector to store the results (the allocations)
# generate treatment assignment for the 1st subject
res[1] = sample(trtseq, 1, replace = TRUE, prob = ratio/sum(ratio)) 
# work through the remaining patients sequentially
for (j in 2:nsample)
  {
  # get treatment assignment sequentially for all subjects
  # The vector res is updated and so all previous allocations are accounted for
  # covmat is the data frame of participant data
    res[j] <- Minirand(
      covmat=covmat, j, covwt=covwt, ratio=ratio, ntrt=ntrt, trtseq=trtseq, method="Range", result=res, p = 0.9
      )
}
## Store the allocation vector 'res' as 'trt1'
trt1 <- res

# Display the number of randomized subjects at covariate factors
balance1 <- randbalance(trt1, covmat, ntrt, trtseq) 
balance1
## $Gender
##    
## trt  0  1
##   1 24 23
##   2 25 23
##   3 14 11
## 
## $Age
##    
## trt  0  1
##   1 34 13
##   2 34 14
##   3 17  8
## 
## $Hypertension
##    
## trt  0  1  2
##   1 25  9 13
##   2 27  9 12
##   3 14  5  6
## 
## $`Use of Antibiotics`
##    
## trt  0  1
##   1 32 15
##   2 33 15
##   3 16  9
# Calculate the total imbalance of the allocation
totimbal(trt = trt1, covmat = covmat, covwt = covwt, 
ratio = ratio, ntrt = ntrt, trtseq = trtseq, method = "Range")
## [1] 2.125

Exercise B.11 Adapt the code above to apply the minimisation algorithm to the licorice gargle data. Investigate how balanced the data set is.

Click for solution

Solution. Notice in the code above that we only need to start about half way down if we already have a dataset (which we do).

nsample = nrow(lg_df)
res = rep(NA, nsample)
res[1] = sample(c(0,1), 1, replace = TRUE, prob = c(0.5,0.5)) 
# work through the remaining patients sequentially
for (j in 2:nsample){
  # get treatment assignment sequentially for all subjects
  # The vector res is updated and so all previous allocations are accounted for
  # covmat is the data frame of participant data - including only the covariates
    res[j] <- Minirand(
      covmat=lg_df[ ,1:7], j, covwt=rep(1,7)/7, ratio=c(1,1), ntrt=2, trtseq=c(0,1), method="Range", result=res, p = 0.9
      )
}

lg_df$treat = res

## You can now investigate the balance of the design as usual

B.2 Analysis

For our analysis section, we will work with real datasets, and use them to explore and compare some of the different analysis methods we’ve covered in lectures. As we haven’t quite finished the analysis section in lectures yet (specifically the section on ANCOVA), you might want to look back at this after our next lecture.

B.2.1 Polyps data

The dataset we’ll use concerns a small trial in which patients with familial adenomatous polyposis (FAP) were treated either with Sulindac (group T) or a placebo (group C). People with FAP tend to have polyps (small growths) grow in their colon. Although the polyps themselves are not dangerous or harmful, they can turn into colon cancer. You can read about the study in Giardiello et al. (1993).

Exercise B.12 First of all let’s explore the dataset:

  • Look at the help file for polyps
  • Which columns should you use for baseline and outcome variables?
  • Plot the data in these columns. Do you see any issues? Can you suggest a way to address them? What effect will this have on our estimator of the treatment effect?
Click for solution

Solution. First of all, to access the data, enter

data(polyps, package = "medicaldata")

You can then view the help file by entering ?polyps.

To investigate the columns in polyps we can use the structure function, str(polyps).

We have the baseline number of polyps for each patient in the column baseline, and a sensible column to use for the outcome would be number12m. Other baseline variables are sex and age.

To plot the baseline and outcome variables we can do

library(ggplot2)
ggplot(data=polyps, aes(x=baseline)) + geom_histogram()

ggplot(data=polyps, aes(x=number12m)) + geom_histogram()

These are both very right-skewed, and exclusively positive (which makes sense given they are counts). A sensible thing to do therefore would be to take the log of these two variables. Since they are counts we will use the log to base 10, so that our results are easier to interpret. If you’d prefer to use natural logs that’s fine; some of your numbers will be different.

polyps$log_baseline = log10(polyps$baseline)
polyps$log_number12m = log10(polyps$number12m)

Since we used the logarithms of the numbers of polyps, our treatment effect is \[\log\left(X_T\right) - \log\left(X_C\right) = \log\left(\frac{X_T}{X_C}\right).\]

We will work through the tests as we did in lectures, to see how the results differ.

Exercise B.13 Using just the outcome variable (which should be log_number12m or something similar; see the solution above if you aren’t sure why), test the hypothesis that Sulindac has had some effect.

Click for solution

Solution. We can either do this the long way or the short way! For sanity’s sake, we’ll do it the long way once, then in subsequent questions I’ll only show the short way in the solutions.

The long way

The first thing to do is to calculate means for each group:

polyps_df = na.omit(polyps) # two participants (001 and 018) don't have data for 12m, so remove these
mean_T = mean(polyps_df$log_number12m[polyps_df$treatment=="sulindac"]) 
sd_T = sd(polyps_df$log_number12m[polyps_df$treatment=="sulindac"])
mean_C = mean(polyps_df$log_number12m[polyps_df$treatment=="placebo"])
sd_C = sd(polyps_df$log_number12m[polyps_df$treatment=="placebo"])
# There are 11 patients on Placebo (group C) and 9 on Sulindac (group T)
# The pooled standard deviation 
pooled_sd_polypsX = sqrt((10*sd_C^2 + 8*sd_T^2)/(10+8))
# Finally we find the test statistic
test_stat = (mean_T - mean_C)/(pooled_sd_polypsX*sqrt(1/11 + 1/9))

Under \(H_0\) (that there is no treatment effect) the test statistic follows a \(t_{18}\) distribution, and therefore we find our \(p\)-value by

2*pt(test_stat, df=18)
## [1] 0.001046571

Note that if the test statistic were positive, we’d have to do

2*(1-pt(test_stat, df=18))

Thus \(p=0.001\) and we conclude that the treatment effect is significant. Our confidence interval is given by

estimate = mean_T - mean_C
error = qt(0.975, df=18)*pooled_sd_polypsX*sqrt(1/11 + 1/9)
c(estimate - error, estimate + error)
## [1] -1.2264748 -0.3678718

The short way

R has a t-test function built in, so we can simply use that. Notice that it gives us everything our heart might desire (on a fairly mundane day), including a confidence interval!

t.test(
  x=polyps_df$log_number12m[polyps_df$treatment == "sulindac"],
  y=polyps_df$log_number12m[polyps_df$treatment == "placebo"],
  alternative = "two.sided",
  var.equal=T, # this makes the method use pooled variances, as we did in lectures
  conf.level = 0.95 # note that this is 1-alpha
)
## 
##  Two Sample t-test
## 
## data:  polyps_df$log_number12m[polyps_df$treatment == "sulindac"] and polyps_df$log_number12m[polyps_df$treatment == "placebo"]
## t = -3.9012, df = 18, p-value = 0.001047
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.2264748 -0.3678718
## sample estimates:
## mean of x mean of y 
## 0.6671373 1.4643106

Either way, our 95% confidence interval is \((-1.226,\;-0.368)\) (to 3 d.p.). Therefore, the confidence interval for \(\frac{X_T}{X_C}\) is

\[\left(10^{-1.226},\;10^{-0.368}\right) = \left(0.0594,\;0.429\right).\] That is, the number of polyps at 12 months for someone in group \(T\) is likely to be somewhere between \(0.0594 X_C\) and \(0.429 X_C\).
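You can back-transform the interval endpoints directly in R:

# back-transform the CI for log10(X_T/X_C) into a CI for X_T/X_C
10^c(-1.2264748, -0.3678718)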

We’ll now move on to comparing the differences between baseline and outcome.

Exercise B.14 Perform another \(t\)-test, this time using the difference between outcome and baseline. Hint: because we’ve taken logs, you’ll have to think about what variable to use, and perhaps experiment with some possibilities.

Click for solution

Solution. The first step is to calculate a difference column. This is somewhat complicated by the fact that we have taken logs of the measurements. Taking logs of the differences would not work, since some are negative. Potentially it might work to use the raw (unlogged) differences

polyps_df$diff = polyps_df$number12m - polyps_df$baseline
ggplot(data=polyps_df, aes(x=diff, fill=treatment)) + geom_histogram(position="dodge", bins=10)

But the outliers look potentially problematic, and the central bulk of each distribution doesn’t look very normal. We could also try the difference of the logged measurements:

polyps_df$diff_log = polyps_df$log_number12m - polyps_df$log_baseline
ggplot(data=polyps_df, aes(x=diff_log, fill=treatment)) + geom_histogram(position="dodge", bins=10)

This looks a lot better - no more outliers and closer to normal-looking. Obviously with so few observations we won’t have a nice normal curve.

To do a t-test, we can use R’s in-built function

t.test(
  x=polyps_df$diff_log[polyps_df$treatment == "sulindac"],
  y=polyps_df$diff_log[polyps_df$treatment == "placebo"],
  alternative = "two.sided",
  var.equal=T, # this makes the method use pooled variances, as we did in lectures
  conf.level = 0.95 # note that this is 1-alpha
)
## 
##  Two Sample t-test
## 
## data:  polyps_df$diff_log[polyps_df$treatment == "sulindac"] and polyps_df$diff_log[polyps_df$treatment == "placebo"]
## t = -3.5142, df = 18, p-value = 0.002477
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.0242756 -0.2578053
## sample estimates:
##   mean of x   mean of y 
## -0.58476481  0.05627565

Finally we can try fitting an ANCOVA model to the data, using the function lm.

Exercise B.15 Fit an ANCOVA model to the polyps data, and interpret your results. If you aren’t familiar with the function lm, it might be a good idea to look at the solutions.

Click for solution

Solution. In the ANCOVA model we saw in lectures, we fit a linear model with the outcome as the dependent / target variable, and the trial group and baseline as independent / explanatory variables.

lm_polyp1 = lm(log_number12m ~ treatment + log_baseline, data=polyps_df)
summary(lm_polyp1)
## 
## Call:
## lm(formula = log_number12m ~ treatment + log_baseline, data = polyps_df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.69628 -0.15530  0.04155  0.17288  0.68553 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         0.6291     0.2798   2.249 0.038084 *  
## treatmentsulindac  -0.7046     0.1675  -4.205 0.000595 ***
## log_baseline        0.5932     0.1824   3.251 0.004698 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3673 on 17 degrees of freedom
## Multiple R-squared:  0.6659, Adjusted R-squared:  0.6266 
## F-statistic: 16.94 on 2 and 17 DF,  p-value: 8.973e-05

Based on this, the effect of the drug sulindac is significant (\(p=0.000595\), lower than in our previous tests). The 95% CI for the treatment effect (which is still \(\log\left(\frac{X_T}{X_C}\right)\)) is now

c(-0.7046 - qt(0.975, df=17)*0.1675, -0.7046 + qt(0.975, df=17)*0.1675 ) 
## [1] -1.0579941 -0.3512059
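Rather than computing the interval by hand from the summary output, you can also ask R for it directly:

confint(lm_polyp1, "treatmentsulindac", level = 0.95)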

We could create a nicer-looking table of the coefficients using tbl_regression (from gtsummary):

tbl_regression(lm_polyp1)
Characteristic     Beta     95% CI¹         p-value
treatment
    placebo
    sulindac       -0.70    -1.1, -0.35     <0.001
log_baseline       0.59     0.21, 0.98      0.005
¹ CI = Confidence Interval

but note that this doesn’t show all the information that summary does.

Unlike in the \(t\)-test where we can only compare measurements like-for-like, in ANCOVA we fit a coefficient to the baseline covariate. This means we are no longer limited to comparing the outcome variable to variables on the same scale, but can also include other baseline variables.

Exercise B.16 Fit another linear model, this time including the other baseline variables.

Click for solution

Solution.

lm_polyp2 = lm(log_number12m ~ treatment + log_baseline + sex + age, data = polyps_df)
summary(lm_polyp2)
## 
## Call:
## lm(formula = log_number12m ~ treatment + log_baseline + sex + 
##     age, data = polyps_df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.59539 -0.32008  0.09788  0.17374  0.62089 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        0.888170   0.503801   1.763 0.098266 .  
## treatmentsulindac -0.730464   0.177936  -4.105 0.000936 ***
## log_baseline       0.495560   0.216900   2.285 0.037307 *  
## sexmale            0.197442   0.201555   0.980 0.342821    
## age               -0.009412   0.012205  -0.771 0.452620    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3779 on 15 degrees of freedom
## Multiple R-squared:  0.688,  Adjusted R-squared:  0.6048 
## F-statistic:  8.27 on 4 and 15 DF,  p-value: 0.0009895

In this case it appears that the other two baseline covariates, age and sex, don’t have a significant effect on the outcome.

Exercise B.17 Inspect some plots of the residuals, to check whether these models have any systematic problems. Does there appear to be any multicollinearity in the data?

Click for solution

Solution. These solutions will demonstrate some methods with the first model. First of all, we can add columns with the residuals and fitted values from the model.

polyps_df$resid1 = resid(lm_polyp1)
polyps_df$fitted1 = fitted(lm_polyp1)
ggplot(data=polyps_df, aes(x=fitted1, y=resid1, col=treatment)) + geom_point()

ggplot(data=polyps_df, aes(x=log_baseline, y=resid1, col=treatment)) + geom_point()

ggplot(data=polyps_df, aes(x=resid1, fill=treatment)) + geom_histogram(bins=20)

None of these ring huge alarm bells, but because the dataset is so small it’s quite hard to tell! Arguably the residuals for the sulindac group are more spread out than those for the placebo, but there are no obvious systematic trends.

We can find the variance inflation factor using the function vif (from the package car):

library(car)
vif(lm_polyp1)
##    treatment log_baseline 
##     1.029768     1.029768

and it indicates no problems. Another sign that the ANCOVA model is well-fitted is that the standard errors of the coefficients are all reasonably small. If there was a problem, for example multicollinearity, some of these would blow up.

B.3 Extension problems

If you have time to spare, or want to look a bit deeper into things after the practical, these exercises are here for you!

B.3.1 A simulation experiment!

In the allocation section we used each method once and produced summaries like the marginal imbalance. Because the methods are inherently random, a single run tells us little about their general behaviour, so we’ll conduct a simulation experiment.

The general process for this is:

  1. Choose a numerical summary (probably either the imbalance or the marginal imbalance)

  2. Choose some large number \(n_{sim}\) and create a vector of zeroes or NAs of that length

  3. Run through the following process \(n_{sim}\) times:

    • Perform the allocation
    • Find the numerical summary
    • Store the numerical summary in the vector you created in step 2.
  4. Plot (or otherwise summarise) the vector of summaries.
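For example, here is a minimal sketch of this process for simple random allocation, using imbalance as the summary (and assuming the srs and imbalance functions defined earlier):

n_sim = 1000
imb_srs = rep(NA, n_sim) # step 2: storage for the summary from each run
for (s in 1:n_sim){
  alloc_s = srs(df = lg_df[,-8])                   # step 3: perform the allocation
  imb_srs[s] = imbalance(alloc_s, alloc = "treat") # ...and store the summary
}
hist(imb_srs) # step 4: plot the distribution of imbalances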

Exercise B.18 Perform the simulation experiment for some of the methods above. Make sure you store the summary vectors by different (and intelligible!) names.

Based on your results, comment on how plausible it is that Ruetzler et al. (2013) used each method in their trial.

Hint: think of the simulated summaries as approximations of probability distributions.

For the licorice dataset, Ruetzler et al. (2013) say that “Randomization (1:1) to licorice or placebo was assigned by a Web-based system that was accessed just before treatment by an independent researcher who was not involved in data collection; no stratification was used.”

B.3.2 Treatment for maternal periodontal disease

We looked at the opt dataset in the last practical in the context of missing data, but now we’re going to look at it for analysis. The opt data are in the package medicaldata, which you should already have loaded.

The aim of the study was to find out whether treating women for periodontal disease during the first 21 weeks of pregnancy reduced the rates of low birth weight and pre-term birth.

The groups are stored in the group variable

  • group T: those who received periodontal treatment, and tooth polishing at their follow-ups
  • group C: Brief oral exams

There are 823 participants and 171 variables - many of these are baseline covariates, but there are also quite a few interim measurements.

The study’s primary outcome variable is gestational age at end of pregnancy, but birth weight was also measured, as well as some clinical measures of periodontal disease and some microbiological and immunological outcomes.

To make this more manageable, we will concentrate on the outcome Birthweight.

Exercise B.19 Plot the variable Birthweight. Do we need to use any transformations before we model it?

Click for solution

Solution.

ggplot(data=opt, aes(x=Birthweight, fill=Group)) + geom_histogram(position="dodge")

These data look nice and normal (although there is a bit of a fat tail to the left), so we won’t transform them.

You’ll remember that there were lots of issues with the opt data.

We’ll reduce the dataset to a more manageable size, by removing all outcomes except Birthweight and reducing the number of covariates. We’ll also impute the variables that we can fill in logically, e.g. BL.Cig.Day, as we did in computer practical 1.

If you still have your version of opt after performing regression imputation on the BMI variable you could use that!

opt_red = opt[ ,c(1:22,72)] 
# Change NAs to "None" for diabetic
# since in opt the non-diabetic people are coded as NA (and therefore excluded from the model)
diab = as.character(opt_red$BL.Diab.Type)
diab[is.na(diab)] = "None"
opt_red$BL.Diab.Type = as.factor(diab)
# similar problem with smokers and how many cigarettes per day

# If people are non-smokers and have missing for number of cigarettes per day
# change their number of cigarettes to zero
sm = opt_red$Use.Tob
cigs = opt_red$BL.Cig.Day
cigs[(is.na(cigs)&(sm=="No "))] = 0
opt_red$BL.Cig.Day = cigs

# Same for alcohol and drinks per day

alc = opt_red$Use.Alc
dr = opt_red$BL.Drks.Day
dr[(is.na(dr)&(alc=="No "))] = 0
opt_red$BL.Drks.Day = dr

# If a participant hasn't had a previous pregnancy, her N.prev.preg should be zero (not NA)

pp = opt_red$Prev.preg
npp = opt_red$N.prev.preg
npp[pp=="No "] = 0
opt_red$N.prev.preg = npp

When we use lm to fit an ANCOVA model, all rows with an NA in will be ignored. It’s therefore important to try to eradicate any NAs where we actually do know the value!
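A quick way to check how many NAs remain in each column after these fixes:

colSums(is.na(opt_red))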

Exercise B.20 Perform a \(t\)-test on the outcome Birthweight. What do you find? Can you also perform a \(t\)-test using difference from baseline?

Click for solution

Solution. To perform the \(t\)-test we can use the inbuilt R function

# Check SDs are fairly close before proceeding
sd(opt_red$Birthweight[opt_red$Group == "T"], na.rm=T)
## [1] 636.82
sd(opt_red$Birthweight[opt_red$Group == "C"], na.rm=T)
## [1] 727.4854
t.test(
  x=opt_red$Birthweight[opt_red$Group == "T"],
  y=opt_red$Birthweight[opt_red$Group == "C"],
  alternative = "two.sided",
  var.equal=T, # this makes the method use pooled variances, as we did in lectures
  conf.level = 0.95 # note that this is 1-alpha
)
## 
##  Two Sample t-test
## 
## data:  opt_red$Birthweight[opt_red$Group == "T"] and opt_red$Birthweight[opt_red$Group == "C"]
## t = 0.74585, df = 807, p-value = 0.456
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -58.49266 130.18492
## sample estimates:
## mean of x mean of y 
##  3216.670  3180.824

We find that the difference in Birthweights between the groups is not even close to significant (\(p=0.456\)).

We can’t do a \(t\)-test with differences because there is no comparable baseline measurement.

Now we will move on to ANCOVA.

Exercise B.21 Before fitting the model, think about what you expect to find. Bear in mind:

  • the result above
  • the description of risk factors of preterm birth / low birth weight in the help file for opt
  • which variables you will include in your model
Click for solution

Solution. The \(p\)-value from the \(t\)-test was very high, so it seems unlikely that we’ll find a significant treatment effect using ANCOVA unless there are some significant interactions going on between covariates.

The help file says (under ‘Background’):

Many risk factors for preterm birth have already been identified, including maternal age, drug use, and diabetes. However, such factors are exhibited in only about half of preterm birth mothers, highlighting a need to expand our understanding of what contributes to preterm birth risk.

Therefore, if we include related factors in our model (many of which we have in our dataset) we should expect to see significant coefficients. But, it sounds like there is a lot that isn’t well understood, so our model is likely not to explain a huge proportion of the variance.

Exercise B.22 Fit an ANCOVA model. What do you find?

Click for solution

Solution. To fit a linear model including every variable as a covariate (apart from the target variable), we can do

lm_full = lm(Birthweight ~ ., data = opt_red[ ,-1]) # don't include the ID column!
summary(lm_full)
## 
## Call:
## lm(formula = Birthweight ~ ., data = opt_red[, -1])
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -3019.46  -233.35    46.89   382.82  1935.06 
## 
## Coefficients: (1 not defined because of singularities)
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            3906.439    702.835   5.558 3.91e-08 ***
## ClinicMN                -57.521     73.391  -0.784  0.43345    
## ClinicMS               -173.068     88.189  -1.962  0.05011 .  
## ClinicNY               -146.548    100.180  -1.463  0.14397    
## GroupT                   28.780     49.914   0.577  0.56440    
## Age                      -1.686      5.417  -0.311  0.75572    
## BlackYes               -158.055    127.807  -1.237  0.21664    
## WhiteYes                -47.195    100.791  -0.468  0.63976    
## Nat.AmYes              -224.757    112.946  -1.990  0.04699 *  
## AsianYes               -142.693    301.298  -0.474  0.63594    
## HispNo                  133.343     75.097   1.776  0.07624 .  
## HispYes                 126.498    115.009   1.100  0.27177    
## EducationLT 8 yrs       -43.324     70.513  -0.614  0.53915    
## EducationMT 12 yrs       23.011     66.301   0.347  0.72864    
## Public.AsstceYes       -144.623     62.886  -2.300  0.02176 *  
## HypertensionY          -372.548    148.033  -2.517  0.01207 *  
## DiabetesYes             443.811    170.343   2.605  0.00938 ** 
## BL.Diab.TypeType I     -425.019    318.490  -1.334  0.18249    
## BL.Diab.TypeType II          NA         NA      NA       NA    
## BMI                       2.124      3.857   0.551  0.58206    
## Use.TobYes              -17.794    134.088  -0.133  0.89446    
## BL.Cig.Day              -23.101     11.560  -1.998  0.04607 *  
## Use.AlcYes             -159.827    223.653  -0.715  0.47509    
## BL.Drks.Day              28.460     24.851   1.145  0.25252    
## Drug.AddNo             -498.971    659.330  -0.757  0.44944    
## Drug.AddYes            -475.778    806.745  -0.590  0.55555    
## Prev.pregYes             85.355     70.825   1.205  0.22856    
## N.prev.preg             -17.281     18.228  -0.948  0.34345    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 653.6 on 686 degrees of freedom
##   (110 observations deleted due to missingness)
## Multiple R-squared:  0.06774,    Adjusted R-squared:  0.03241 
## F-statistic: 1.917 on 26 and 686 DF,  p-value: 0.004183

We see from the model summary that our model is terrible: the adjusted \(R^2\) is \(0.03241\), which means it is explaining about 3% of the variance in Birthweight.

If you want to use only certain terms, you can include them in the formula, for example

lm_eg = lm(Birthweight ~ Group + Age + Hypertension, data=opt_red)

We see that, as we expected, the Group variable is not significant (\(p=0.5644\)). However, some terms are significant, for example whether or not the participant has diabetes, and how many cigarettes a mother smokes per day - this isn’t surprising given the contextual information we had.

Exercise B.23 Perform some diagnostic checks on your model. Do you have any reason to suspect it isn’t adequate?

Click for solution

Solution. One thing to notice from the linear model summary is that many coefficients have proportionally quite large standard errors. This could be because of a lack of data (e.g. if there are very few participants in some category) or because of multicollinearity (in which case the model cannot know which of the linked variables to attribute the effect to). Combining some categories or carefully removing some covariates could help improve the model.

We can also study the residuals of our model, to check that they appear to be homoskedastic and approximately normal with mean 0.

The first step is to extract the residuals and the fitted values (which are also useful). We will create a new data frame called opt_diag with these in, so that we can plot things easily but don’t pollute our original dataset (in case we want to fit any more models)

opt_diag = na.omit(opt_red) # lm only fits where all variables are present
opt_diag$resid = resid(lm_full)
opt_diag$fitted = fitted(lm_full)

Some examples of plots are

ggplot(data=opt_diag, aes(x=resid, fill=Group)) + geom_histogram(position="dodge")

ggplot(data=opt_diag, aes(x=fitted, y=resid, col=Group)) + geom_point()

Exercise B.24 Given the results of your ANCOVA model, what do you think the risks would be if the study had been much smaller, or if the allocation had not been well-balanced?

Click for solution

Solution. In this case, imbalance in the baseline factors that did turn out to be significant could make it appear that the treatment had a significant effect. This wouldn’t be possible with ANCOVA, but it would with a \(t\)-test, in which the other covariates aren’t considered.

References

Efron, Bradley. 1971. “Forcing a Sequential Experiment to Be Balanced.” Biometrika 58 (3): 403–17.
Giardiello, Francis M, Stanley R Hamilton, Anne J Krush, Steven Piantadosi, Linda M Hylind, Paul Celano, Susan V Booker, C Rahj Robinson, and G Johan A Offerhaus. 1993. “Treatment of Colonic and Rectal Adenomas with Sulindac in Familial Adenomatous Polyposis.” New England Journal of Medicine 328 (18): 1313–16.
Ruetzler, Kurt, Michael Fleck, Sabine Nabecker, Kristina Pinter, Gordian Landskron, Andrea Lassnigg, Jing You, and Daniel I Sessler. 2013. “A Randomized, Double-Blind Comparison of Licorice Versus Sugar-Water Gargle for Prevention of Postoperative Sore Throat and Postextubation Coughing.” Anesthesia & Analgesia 117 (3): 614–21.