Types of Microbiome Study Design
A few common study designs are used to address different aims. In each case, enough samples must be included to provide the statistical power needed for confidence in the results and their interpretation.
A typical class comparison study aims to identify the microbial taxa that differ in abundance between samples representing two or more groups, such as different treatments, locations, or time points. For example, the aim of a class comparison study may be to identify the microbial taxa that are more or less abundant in the stomachs of patients treated with a drug. Samples are chosen to represent groups with and without treatment, and the microbial profiles are analyzed for taxa that change in response to treatment.
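As a minimal sketch of a class comparison for a single taxon, the following uses an exact permutation test on the difference in group means. The abundance values are hypothetical, the function name is illustrative, and the permutation test is only one of several tests that could be used here. In a real study the test would be repeated for every taxon, and the resulting p-values would need a multiple-testing correction.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))

    hits = 0
    splits = 0
    # Enumerate every way of relabeling the pooled specimens into two groups.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            hits += 1
        splits += 1
    return hits / splits


# Hypothetical relative abundances (%) of one taxon, five specimens per group.
untreated = [12.1, 10.4, 11.8, 13.0, 9.9]
treated = [6.2, 7.5, 5.1, 8.0, 6.9]

p_value = permutation_p_value(untreated, treated)
print(f"p = {p_value:.4f}")
```

With five specimens per group there are only 252 possible relabelings, so the smallest p-value this test can return is 2/252, which illustrates how a small sample number limits what a comparison can show.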
Class prediction studies aim to assign an unknown sample to one of two or more classes based on its microbial community profile. For example, the microbial profile of samples from fecal-contaminated water can be compared to the microbial profiles of samples from local cow, avian, and human sewage to determine the source of the contamination. Class prediction is also useful in the development of microbial diagnostic or prognostic signatures of disease.
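The source-tracking example can be sketched as a nearest-centroid classifier: the unknown profile is assigned to the reference class whose average profile is most similar. The sketch below assumes Bray-Curtis dissimilarity as the distance measure and uses hypothetical abundance values for four taxa; the function names are illustrative, and real studies typically use more elaborate classifiers with cross-validation.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors (0 = identical)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0


def centroid(profiles):
    """Mean abundance of each taxon across the profiles of one class."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def predict_source(unknown, reference):
    """Return the closest class label and the distance to every class centroid."""
    distances = {
        label: bray_curtis(unknown, centroid(profiles))
        for label, profiles in reference.items()
    }
    return min(distances, key=distances.get), distances


# Hypothetical relative abundances (%) of the same four taxa in each sample.
reference = {
    "cow": [[60, 25, 10, 5], [55, 30, 10, 5]],
    "avian": [[10, 15, 60, 15], [15, 10, 65, 10]],
    "human": [[20, 55, 5, 20], [25, 50, 5, 20]],
}
water_sample = [22, 52, 6, 20]

label, distances = predict_source(water_sample, reference)
print(label, {k: round(d, 3) for k, d in distances.items()})
```

Reporting the distance to every class, not only the winning label, makes it easier to see when an unknown sample is not close to any reference source.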
Class discovery studies aim to uncover subsets of samples with distinct microbial profiles that may indicate a change in an environmental or human factor. For example, the aim of a class discovery study may be to identify a subgroup of gastrointestinal disease patients with a different underlying etiology. Another example is the assessment of multiple agricultural fields to identify microbial diversity impacting crop yield. Class discovery studies also include those that aim to identify a core set of microbes present across multiple groups.
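The core-set aim can be illustrated with a small sketch that keeps only the taxa detected in a chosen fraction of samples within every group. The prevalence threshold, the field names, and the abundance values are all assumptions made for illustration; the source does not prescribe a definition of "core".

```python
def prevalent_taxa(samples, min_prevalence):
    """Taxa with non-zero abundance in at least min_prevalence of the samples."""
    counts = {}
    for sample in samples:
        for taxon, abundance in sample.items():
            if abundance > 0:
                counts[taxon] = counts.get(taxon, 0) + 1
    n = len(samples)
    return {taxon for taxon, c in counts.items() if c / n >= min_prevalence}


def core_taxa(groups, min_prevalence=1.0):
    """Taxa that pass the prevalence threshold in every group."""
    per_group = [prevalent_taxa(s, min_prevalence) for s in groups.values()]
    return set.intersection(*per_group) if per_group else set()


# Hypothetical relative abundances (%) from two agricultural fields.
groups = {
    "field_A": [
        {"Bacillus": 30, "Pseudomonas": 12, "Rhizobium": 0},
        {"Bacillus": 25, "Pseudomonas": 8, "Rhizobium": 3},
    ],
    "field_B": [
        {"Bacillus": 18, "Pseudomonas": 0, "Streptomyces": 9},
        {"Bacillus": 22, "Pseudomonas": 5, "Streptomyces": 11},
    ],
}

strict_core = core_taxa(groups, min_prevalence=1.0)
relaxed_core = core_taxa(groups, min_prevalence=0.5)
print(sorted(strict_core), sorted(relaxed_core))
```

The two calls show how strongly the result depends on the threshold, which is one reason a core set should be reported together with the prevalence criterion used.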
Statistical Power for Microbiome Studies
Factors that affect power and sample size calculations include the biological variability in the sample set, the magnitude of the microbial differences, and the statistical significance criterion used in the test. Classically, a large enough study allows rejection of the starting assumption (the null hypothesis) that measurements in the sample groups are identical. In general, the smaller the effect size (i.e., the measured difference) and the greater the variability within a group or type of specimen, the more specimens are required to adequately power the study. Substantial biological variability exists in most microbial systems, including human and animal sources, biological digestion, and other systems. Because each microbial profile contains many data points, some similarities and differences between specimens will occur by chance. Therefore, for a study to be adequately powered, the sample number must be large enough to outweigh the effects of chance, overcome the inherent biological diversity, and reveal the true changes, similarities, and differences that are characteristic of the different types of specimens.
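The relationship between these factors can be sketched with a normal approximation to the power of a two-sided comparison of two group means. Here the effect size is assumed to be standardized (difference in means divided by the within-group standard deviation), and the function name and example values are illustrative. Microbiome measurements are often not normally distributed, so this is a rough guide only.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group, effect_size, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    effect_size is the standardized difference (mean difference / SD).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected shift of the test statistic under the alternative hypothesis.
    shift = abs(effect_size) * sqrt(n_per_group / 2)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Power rises with specimen number and with effect size.
for n in (5, 10, 25, 50):
    large = approx_power(n, effect_size=0.8)
    small = approx_power(n, effect_size=0.4)
    print(f"n={n:2d}  power(d=0.8)={large:.2f}  power(d=0.4)={small:.2f}")
```

Halving the effect size, which is equivalent to doubling the within-group variability for the same difference, requires roughly four times as many specimens to reach the same power under this approximation.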
As a starting point, a minimum of five biological replicates per group should be analyzed to identify an emerging trend. Preliminary or pilot data generated using five specimens per study group frequently reveal more sources of variability within a group than anticipated. As more samples are added, the characteristics of the microbial population emerge from this variability, and true differences can reach statistical significance. Preliminary data from a pilot study often provide valuable information about the variability that will influence the data and their interpretation. This information is in turn used to carry out a power analysis that informs the design of the larger study. The rule of thumb is that more specimens generally provide more power.
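A pilot-based power analysis can be sketched as follows: estimate a standardized effect size from the pilot groups, then convert it into an approximate number of specimens per group. The sketch assumes a two-sided comparison of two group means under a normal approximation, and the pilot values (a diversity index for five specimens per group) are hypothetical. An effect size estimated from so few specimens is itself uncertain, so the result is better read as a lower bound than as a target.

```python
from math import ceil, sqrt
from statistics import NormalDist, mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def samples_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate specimens per group for a two-sided two-group comparison."""
    if effect_size == 0:
        raise ValueError("effect size must be non-zero")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# Hypothetical pilot data: one diversity index value per specimen.
pilot_control = [3.1, 2.8, 3.4, 3.0, 2.7]
pilot_treated = [2.5, 2.9, 2.2, 2.6, 2.4]

d = cohens_d(pilot_control, pilot_treated)
n_needed = samples_per_group(d)
print(f"estimated effect size d = {d:.2f}, specimens per group = {n_needed}")

# A more conservative plan assumes a smaller effect than the pilot suggests.
print("specimens per group if d = 0.5:", samples_per_group(0.5))
```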
It can be tempting to pool specimens to reduce experimental costs, compensate for insufficient sample material, or reduce sample variation. However, sample pooling results in the loss of low-biomass information that may be especially important to studies that make inferences about a microbial population. Whenever possible, investigators should use non-pooled specimens.
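One way to picture this loss is a toy calculation in which a low-abundance taxon present in a single specimen is diluted below a detection limit once specimens are mixed. The sketch assumes equal amounts of each specimen are pooled, so the pooled profile is the simple average, and it uses a hypothetical detection limit of 0.1% relative abundance. The taxon label "rare_taxon" and all values are illustrative.

```python
def pool_equal_mass(profiles):
    """Average the profiles, as if equal amounts of each specimen were mixed."""
    taxa = sorted({taxon for profile in profiles for taxon in profile})
    n = len(profiles)
    return {t: sum(p.get(t, 0.0) for p in profiles) / n for t in taxa}


def detected(profile, detection_limit):
    """Taxa whose relative abundance reaches the detection limit."""
    return {t for t, abundance in profile.items() if abundance >= detection_limit}


# Hypothetical relative abundances (%); one specimen carries a rare taxon.
specimens = [
    {"Bacteroides": 60.0, "Prevotella": 40.0},
    {"Bacteroides": 55.0, "Prevotella": 45.0},
    {"Bacteroides": 59.6, "Prevotella": 40.0, "rare_taxon": 0.4},
    {"Bacteroides": 62.0, "Prevotella": 38.0},
    {"Bacteroides": 58.0, "Prevotella": 42.0},
]
DETECTION_LIMIT = 0.1  # assumed value, in percent relative abundance

seen_individually = set().union(*(detected(s, DETECTION_LIMIT) for s in specimens))
pooled = pool_equal_mass(specimens)
seen_in_pool = detected(pooled, DETECTION_LIMIT)

print("lost by pooling:", sorted(seen_individually - seen_in_pool))
```

The pooled profile is also a single measurement, so it carries no information about how much the specimens differed from one another, which is the variability a power analysis depends on.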
Technical replication allows for processing and measurement variability to be estimated. In general, technical replicate analysis is unnecessary when microbial analysis is conducted in a laboratory compliant with Good Laboratory Practices and Quality Management Systems running quality control and quality assurance programs.