Power calculation

Before starting a study you should estimate the sample size you need in order to have enough power to detect a clinically significant effect. This process is often called a power calculation.

One of the most common questions asked of a statistician about study design is the number of patients needed. It is an important question, because if a study is too small it will not be able to answer the question posed, and would be a waste of time and money. It could also be deemed unethical because patients may be put at risk with no apparent benefit. However, studies should not be too large because resources would be wasted if fewer patients would have sufficed.

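The sample size for any outcome depends on four critical quantities (which need to be known before a sample size calculation can be done):

  1. the type I error rate α (the significance level);
  2. the type II error rate β (power is 1 − β);
  3. the variability of the data, σ (the standard deviation);
  4. the effect size d.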

In a trial the effect size is the amount by which we would expect the two treatments to differ, or is the minimum difference that would be clinically worthwhile.

Usually, α and β are fixed at 5% and 10% respectively (a β error rate set at 10% will give 90% power). A simple formula for a two-group parallel trial with a continuous outcome is that the required sample size per group is given by n = 16σ²/d² for a two-sided α of 5% and β of 20% (80% power), where σ is the between-subject standard deviation and d is the effect size.
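The constant 16 can be checked against the usual normal-approximation formula n = 2(z(1 − α/2) + z(1 − β))² σ²/d². That formula is not stated in the text above; it is assumed here as the source of the constant. A minimal Python sketch (the function name `multiplier` is illustrative only):

```python
from statistics import NormalDist


def multiplier(alpha=0.05, power=0.80):
    """Constant k in n = k * sigma^2 / d^2 (per group, two-sided test).

    Assumes the normal-approximation formula k = 2 * (z_alpha + z_beta)^2.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = std_normal.inv_cdf(power)           # quantile for power = 1 - beta
    return 2 * (z_alpha + z_beta) ** 2


print(round(multiplier(0.05, 0.80), 2))  # 80% power
print(round(multiplier(0.05, 0.90), 2))  # 90% power
```

With a two-sided α of 5%, the constant comes out at about 15.7 for 80% power, which rounds to the 16 used in the simple formula, and at about 21 for 90% power.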

For example, in a trial to reduce blood pressure, if a clinically worthwhile effect for diastolic blood pressure is 5 mmHg and the between-subject standard deviation is 10 mmHg, we would require n = 16 × 100 / 25 = 64 patients per group in the study.
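The blood pressure example can be reproduced with a few lines of Python. This is a sketch of the simple formula only (two-sided α of 5%, 80% power); the function name `n_per_group_continuous` is made up for illustration.

```python
import math


def n_per_group_continuous(sd, diff, k=16):
    """Per-group sample size from the simple formula n = k * sd^2 / diff^2.

    k = 16 corresponds to a two-sided alpha of 5% and 80% power.
    The result is rounded up to a whole number of patients.
    """
    return math.ceil(k * sd ** 2 / diff ** 2)


# Diastolic blood pressure: worthwhile effect 5 mmHg, standard deviation 10 mmHg
print(n_per_group_continuous(sd=10, diff=5))  # 64 patients per group
```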

For a binary outcome we need to specify α and β (as described above), and proportions P1 and P2, where P1 is the expected outcome under the control intervention and P1 − P2 is the minimum clinical difference that is worth detecting. A simple formula for binary data, again for a two-sided α of 5% and 80% power, is n = 8 × [P1(1 − P1) + P2(1 − P2)] / (P1 − P2)² per group. Thus suppose that in a clinical trial the standard therapy resulted in an improvement for 35% of patients and we would like the new therapy to result in an improvement for 45% of patients. Expressing these percentages as proportions and using the above formula we would require

8 × (0.45 × 0.55 + 0.35 × 0.65) / (0.10 × 0.10) = 380 subjects per group to have an 80% chance (a power of 80%) of detecting the specified difference at 5% significance.
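The same calculation for a binary outcome can be sketched in Python. The function name `n_per_group_binary` is illustrative, and the rounding step is an implementation detail added here to guard against floating-point noise, not part of the formula.

```python
import math


def n_per_group_binary(p1, p2, k=8):
    """Per-group sample size from n = k * [p1(1-p1) + p2(1-p2)] / (p1-p2)^2.

    k = 8 corresponds to a two-sided alpha of 5% and 80% power.
    """
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = k * variance_sum / (p1 - p2) ** 2
    # Round to 6 decimals first so that tiny floating-point errors
    # cannot push an exact answer up to the next whole number.
    return math.ceil(round(n, 6))


# Standard therapy: 35% improve; hoped-for result on new therapy: 45% improve
print(n_per_group_binary(0.35, 0.45))  # 380 subjects per group
```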

The sample size increases with the square of the standard deviation of the data (the variance) and decreases with the square of the effect size. Doubling the effect size reduces the sample size by a factor of 4: it is much easier to detect a large difference than a small difference. This is analogous to looking at an object on the horizon. A large object can be seen with the naked eye (i.e. without very powerful equipment); however, as the object gets smaller you need more and more powerful equipment to see it. So it is with study size: as the difference to be detected increases, the number needed to detect it gets smaller.
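The scaling can be seen numerically with the simple continuous-outcome formula n = 16σ²/d². The sketch below reuses the blood pressure figures (standard deviation 10 mmHg, effect 5 mmHg) as a baseline; the doubled values are hypothetical and chosen only to show the factor of 4.

```python
def n_simple(sd, diff, k=16):
    """Unrounded per-group sample size, n = k * sd^2 / diff^2."""
    return k * sd ** 2 / diff ** 2


baseline = n_simple(sd=10, diff=5)         # reference scenario
doubled_effect = n_simple(sd=10, diff=10)  # effect size doubled
doubled_sd = n_simple(sd=20, diff=5)       # standard deviation doubled

print(baseline / doubled_effect)  # factor by which n shrinks
print(doubled_sd / baseline)      # factor by which n grows
```

Both ratios equal 4: doubling the effect size quarters the required sample size, and doubling the standard deviation quadruples it.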

In practice the sample size is often fixed by other criteria, such as finance or resources, and the formula is used in reverse to determine the effect size that the study could realistically detect. If this detectable effect is too large to be plausible, then the study will have to be abandoned or increased in size.
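Working backwards from a fixed sample size amounts to rearranging n = 16σ²/d² into d = σ√(16/n). A minimal Python sketch, assuming the same simple formula (two-sided α of 5%, 80% power); the function name and the figure of 100 patients per group are hypothetical.

```python
import math


def detectable_difference(sd, n_per_group, k=16):
    """Smallest difference detectable with n_per_group patients per group,
    obtained by solving n = k * sd^2 / d^2 for d."""
    return sd * math.sqrt(k / n_per_group)


# Hypothetical budget: 100 patients per group, standard deviation 10 mmHg
print(round(detectable_difference(sd=10, n_per_group=100), 2))
```

Under these assumptions a study of 100 patients per group could detect a difference of about 4 mmHg. If only a much smaller difference were clinically plausible, the study would be too small.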

Five key questions regarding sample size:

  1. What is the main purpose of the study?
  2. What is the principal measure of patient outcome?
  3. How will the data be analysed to detect a treatment difference?
  4. What type of results does one anticipate with standard treatment?
  5. How small a treatment difference is it important to detect and with what degree of certainty?

Thus in order to calculate the sample size for a study it is first necessary to decide what your outcome is. If your outcome variable is continuous you will need some measure of what you would expect it to be in the control group, together with an estimate of its standard deviation. You will also need to know what size of effect is expected or desirable (be realistic with this). If your outcome variable is binary you will need an idea of the proportions falling into the two outcome categories, and what change in these proportions can be expected or is desirable.

After deciding on the purpose of the study and the principal outcome measure, the investigator must decide how the data are to be summarised and analysed to detect a treatment difference. Thus, the investigator must choose an appropriate summary measure of this outcome and then calculate a sample size based on the smallest treatment difference in this summary measure that is of such clinical value that it would be very undesirable to fail to detect it. Given answers to all five questions above, we can then calculate a sample size.

N.B.: we would not expect a non-statistician to answer question 3, but they should be able to answer the other four questions.