What are effect size, power, and sample size calculation and why do we care?

Last Updated on May 23, 2019 by Ayla Myrick

You may have heard these three terms and found them confusing, especially when a dissertation or project requires you to calculate them and interpret them.

What do the three terms in the title mean?

Effect size (ES) is the name given to a group of statistics that measure the magnitude or strength of a treatment effect or phenomenon. ES measures are the common metric of meta-analyses, which summarize the findings from a specific area of research. The effect size tells us how easy or difficult it will be to detect an effect in a research project: the larger the effect, the easier it is to find.
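As one concrete member of this family, Cohen's d expresses the difference between two group means in units of their pooled standard deviation. The sketch below is a minimal illustration only; the scores are made up for the example and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Hypothetical scores, invented purely for illustration.
treatment = [12, 15, 11, 14, 13]
control = [10, 12, 9, 11, 13]

d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.2f}")
```

With these invented numbers the means differ by 2 points and the pooled standard deviation is about 1.58, so d comes out at roughly 1.26.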

The power of any test of statistical significance is defined as the probability that it will correctly reject a false null hypothesis. The question then becomes: how much power do you want your test to have?

Sample size calculation is done to make sure that enough participants or observations are gathered for the hypothesis test to have enough power to detect a true effect if it is actually present. Sample size calculation, therefore, depends on effect size and power.
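To make the link between effect size, sample size, and power concrete, here is a rough sketch for the simplest case: a two-sided comparison of two equal-sized groups, using a normal approximation. This is an assumption chosen for illustration (it is not the ANOVA used later in this article), and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group test (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # critical value for the chosen alpha
    shift = d * sqrt(n_per_group / 2)   # expected test statistic under the true effect
    # Probability that the statistic lands beyond either critical bound.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Power grows with sample size for a fixed effect size (here d = 0.5).
for n in (20, 40, 64, 100):
    print(n, round(approx_power(0.5, n), 3))
```

Note that when the effect size is zero the function returns alpha, since the only rejections left are false alarms.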

How do we calculate sample size?

Before calculating sample size, we need to know four things: the alpha level, the desired power level, the effect size of the phenomenon under study, and the statistical procedure that will be used to test our hypothesis. Whew, that is a great deal to know. Where do we begin? We start by conducting what is known as a power analysis.

What is a power analysis?

  1. The primary purpose of power analysis is to estimate sample size. First, the researcher must specify the power level they want to achieve. The conventional power level is usually .80 to .95, depending on your field of interest.
  2. The calculations for power depend on the effect size of the phenomenon under study in the population. You can use published experiments similar to the one you will be conducting, or a meta-analysis done on your topic of interest, as a guide to finding or calculating the effect size yourself.
  3. Use the conventional alpha level for your field. In the behavioral sciences we use an alpha level of .05.
  4. Choose the statistical test you will use to test your hypothesis.
  5. Finally, settle on the power level from step 1. A power of .80 to .95 means that, if the effect is really as large as you assumed, you have an 80% to 95% chance of rejecting the false null hypothesis and avoiding a Type II error. Now that you have done this, what is next?
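Before turning to software, the inputs listed above can be combined by hand in the simplest case. The sketch below assumes a two-sided comparison of two equal-sized groups and a normal approximation, so it slightly underestimates what an exact t-test calculation would require; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-group comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # step 3: alpha level (two-sided)
    z_power = z.inv_cdf(power)          # steps 1 and 5: desired power
    # Step 2 enters through d, the standardized effect size.
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)

print(n_per_group(0.5))               # medium effect, power .80
print(n_per_group(0.5, power=0.95))   # same effect, power .95
```

Raising the target power or shrinking the assumed effect size both push the required sample size up, which is why the effect size estimate deserves care.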

How do we conduct a sample size calculation?

Once you know your power level, effect size, alpha level and statistical test for the hypothesis, you may use a freely available program known as G*Power (Faul, Erdfelder, Lang, & Buchner, 2007). Faul et al. (2007) developed this program at the University of Düsseldorf and have made it available to the public for free. G*Power can compute power analyses for many different hypothesis tests, such as t tests, F tests, χ² tests, z tests and some exact tests. G*Power can also be used to compute effect sizes and to display the results of power analyses graphically.

Let’s look at an example of how to do this:

So, we decide that we are willing to have a power of .80, and we find from a published meta-analysis that the effect size of the phenomenon we are studying is f = .30 (Cohen's f, the effect size measure G*Power uses for ANOVA). By Cohen's conventions for f (.10 small, .25 medium, .40 large), this falls between a medium and a large effect. We want to do a one-way ANOVA with three groups (Treatment 1, Treatment 2, and Placebo) on the dependent variable of depression level to test our hypothesis regarding differential effects among two treatments and a placebo.

Now, open up G*Power and choose F tests, then choose ANOVA: Fixed effects, omnibus, one-way. Set the effect size f to .30, alpha to .05, power to .80 and the number of groups to 3. G*Power does the calculation and produces the output shown below. We found that we need a total sample size of 111 (37 per group) to have enough power (.80) to detect an effect size of f = .30. Please see Table 1 and Figure 1.

Table 1. F tests – ANOVA: Fixed effects, omnibus, one-way
______________________________________________________________________
Analysis:  A priori: Compute required sample size

Input:     Effect size f                 =  0.30
           α err prob                    =  0.05
           Power (1-β err prob)          =  0.80
           Number of groups              =  3

Output:    Noncentrality parameter λ     =  9.9900000
           Critical F                    =  3.0803869
           Numerator df                  =  2
           Denominator df                =  108
           Total sample size             =  111
           Actual power                  =  0.8034951
______________________________________________________________________
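As a rough sanity check on the table, the result can be approximated by simulation: draw many samples of 37 participants per group from populations whose means differ by an amount equivalent to f = .30, run a one-way ANOVA on each, and count how often F exceeds the critical value. The sketch below is not how G*Power computes its answer. It assumes normal data with a standard deviation of 1, one particular pattern of group means (low, middle, high), and takes the critical F directly from Table 1; all function names are hypothetical.

```python
import random
from math import sqrt

def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = len(groups) - 1, len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def simulated_power(f=0.30, n_per_group=37, critical_f=3.0803869,
                    reps=4000, seed=1):
    """Share of simulated studies whose F exceeds the critical value."""
    rng = random.Random(seed)
    # Means (-a, 0, a) with SD 1 give Cohen's f = a * sqrt(2/3).
    a = f * sqrt(1.5)
    rejections = 0
    for _ in range(reps):
        groups = [[rng.gauss(mu, 1.0) for _ in range(n_per_group)]
                  for mu in (-a, 0.0, a)]
        if anova_f(groups) > critical_f:
            rejections += 1
    return rejections / reps

noncentrality = 0.30 ** 2 * 111   # lambda = f^2 * N, matching Table 1
estimated_power = simulated_power()
print(f"lambda = {noncentrality:.2f}, simulated power = {estimated_power:.3f}")
```

The simulated power should land close to the .80 reported in the table, with some run-to-run noise that shrinks as the number of repetitions grows.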
Once you understand what is involved and where to find those values and procedures, the whole idea of sample size estimation turns out to be not so intimidating. See the G*Power developers' instructions for use. Good luck in your research endeavors.

Ayla Myrick