Prelude to a crossover, part deux

The anatomy of a washout, for better or worse.


Baseline data are shown in blue.  On the left are the subjects and their body weights prior to randomization.  At baseline in phase I, we can see that the randomization wasn’t perfect, but that doesn’t really matter so much because this is a CROSSOVER study.  Note that the group assigned to receive active drug first weighs slightly less than the group assigned to placebo (98 vs. 102 kg).

The drug causes a 10 kg weight loss and there is no relevant placebo effect.

After a treatment-appropriate washout period, we are back again at baseline, but this time for phase II.  Note that the body weight of subjects 1-3 at the end of phase I (89, 88, and 87 kg) has returned to normal.  Now subjects 4-6 get the active treatment and experience a similar outcome.  The final summary appears in the column on the right: even though randomization at baseline was imperfect, the differences were crushed by the superiority of the crossover design, and we see the true drug effect regardless of whether we are comparing drug to baseline OR drug to placebo.  Voila, Mucho gusto, and Kudos.
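The arithmetic behind this first scenario can be sketched in a few lines of Python. The end-of-phase-I weights for subjects 1-3 (89, 88, 87 kg), the 10 kg drug effect, and the group means (98 and 102 kg) come from the example. The individual weights for subjects 4-6 (101, 102, 103 kg) are assumed here only so that they average 102 kg, and the variable names are made up for illustration. "Drug to baseline" is taken to mean each subject's weight at the start of the phase in which they received the drug, which is also an assumption.

```python
from statistics import mean

DRUG_EFFECT = -10  # kg, from the example; the placebo effect is taken as 0

# Phase I baselines. Subjects 1-3 are back-calculated from their
# end-of-phase-I weights (89, 88, 87 kg). Subjects 4-6 are ASSUMED
# values, chosen only so that they average 102 kg.
drug_first = [99, 98, 97]          # subjects 1-3
placebo_first = [101, 102, 103]    # subjects 4-6 (hypothetical values)

# Phase I: subjects 1-3 take the drug, subjects 4-6 take placebo.
end_p1_drug = [w + DRUG_EFFECT for w in drug_first]   # 89, 88, 87
end_p1_placebo = list(placebo_first)

# Adequate washout: everyone starts phase II at their original weight.
base_p2_a = list(drug_first)       # subjects 1-3, now on placebo
base_p2_b = list(placebo_first)    # subjects 4-6, now on drug

# Phase II: the treatments are swapped.
end_p2_placebo = list(base_p2_a)
end_p2_drug = [w + DRUG_EFFECT for w in base_p2_b]

# Pool both phases, as a crossover summary would.
after_drug = mean(end_p1_drug + end_p2_drug)
after_placebo = mean(end_p1_placebo + end_p2_placebo)
before_drug = mean(drug_first + base_p2_b)

drug_vs_baseline = after_drug - before_drug
drug_vs_placebo = after_drug - after_placebo
print(drug_vs_baseline, drug_vs_placebo)
```

Both comparisons come out to -10 kg, even though the two sequence groups started 4 kg apart, because every subject contributes to both the drug and the placebo column.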


Take II.

Everything from baseline until the end of phase I is identical to the above example.  BUT the washout period is inadequate, so the group who received active drug during phase I (subjects 1-3) has not returned to baseline and thus exhibits treatment-specific spillover effects.  Subjects 1-3 are at an artificially lower body weight for the baseline measurements of phase II, so the total baseline data are reduced (97.5 kg vs. 100 kg).  Now we get a different answer depending on whether we compare drug to baseline or drug to placebo.  This example illustrates one small error, but it is grievous.  Larger errors are made, and they are worse.  At one end of the spectrum, livelihoods and intellectual progress depend on the accuracy of these data.  At the other, someone may be prescribed a sub-optimal medication, prescribe a wrong medication, waste time, etc.  Failing to account for a particular confounding variable and carelessly (or otherwise) using an improper statistical technique are two very different errors.  (end soapbox diatribe).
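To see where the two answers diverge, here is a rough Python sketch of the inadequate-washout case. Several things in it are assumptions rather than data from the example: the individual weights for subjects 4-6 (101, 102, 103 kg, chosen to average 102 kg), the 5 kg residual for subjects 1-3 (the value that reproduces the 97.5 kg phase II baseline), the idea that subjects 1-3 do not regain any more weight during phase II, and the definition of "drug to baseline" as each subject's weight at the start of the phase in which they took the drug. The function name `crossover_summary` is made up for illustration.

```python
from statistics import mean

def crossover_summary(residual_kg, drug_effect=-10):
    """Return (drug vs. baseline, drug vs. placebo, phase II baseline mean).

    residual_kg is how far below their original weight subjects 1-3
    still are when phase II starts (0 means an adequate washout).
    """
    drug_first = [99, 98, 97]        # subjects 1-3, from 89/88/87 plus 10 kg
    placebo_first = [101, 102, 103]  # subjects 4-6, ASSUMED (mean 102 kg)

    # Phase I: subjects 1-3 on drug, subjects 4-6 on placebo (no effect).
    end_p1_drug = [w + drug_effect for w in drug_first]
    end_p1_placebo = placebo_first

    # Phase II baselines: subjects 1-3 may not have fully recovered.
    base_p2_a = [w - residual_kg for w in drug_first]
    base_p2_b = placebo_first

    # Phase II: subjects 1-3 on placebo, subjects 4-6 on drug.
    # ASSUMPTION: no further weight regain during phase II.
    end_p2_placebo = base_p2_a
    end_p2_drug = [w + drug_effect for w in base_p2_b]

    after_drug = mean(end_p1_drug + end_p2_drug)
    after_placebo = mean(end_p1_placebo + end_p2_placebo)
    before_drug = mean(drug_first + base_p2_b)
    p2_baseline = mean(base_p2_a + base_p2_b)
    return after_drug - before_drug, after_drug - after_placebo, p2_baseline

print(crossover_summary(0))  # adequate washout
print(crossover_summary(5))  # subjects 1-3 still 5 kg below baseline
```

Under these assumptions, an adequate washout gives -10 kg for both comparisons, while a 5 kg residual leaves drug vs. baseline at -10 kg but shrinks drug vs. placebo to -7.5 kg, because the placebo column now includes subjects who are still carrying part of the drug effect.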



calories proper


  • J.Ham

    Hi Bill,

    I don’t quite understand your treatment of the crossover study because it seems to me that randomization does matter. If the groups are different from the start then you’d be comparing the drug in two different circumstances which may give different results. In your drug example it may be that the drug works for people under 100kg (subjects 1-3) but not for people over 100kg (subjects 4-6).


    • Hi J,
      In a crossover study, everyone gets both treatments so randomization only applies to which treatment they get first. This usually works well as long as there’s an adequate washout period. The example you gave is an important one but would require a subgroup analysis.

      • J.Ham

        Right, but why randomize the sequence if randomization doesn’t matter? Why not just give half of the subjects the drug/placebo sequence and the other half the placebo/drug sequence without randomization?

        • For example, we wouldn’t want this situation:

          100kg people get placebo in summer & drug in winter.

          Gotta randomize. More here:

          • J.Ham

            Bill, I think you are talking about the sequences. The different sequences are used mainly to account for the “order effect”. However, since the baseline characteristics do not matter the sequences can be allocated non-randomly so that half get the AB sequence and the other half get the BA sequence. What is the difference between allocating the sequences randomly and non-randomly? The only purpose I can see of randomly assigning sequences is to ensure that the baseline characteristics of both sequence groups are similar. If one asserts that the baseline characteristics are not important then randomly allocating sequences seems pointless.