Studies have shown that fewer than 1% of bacterial species in a given environment are culturable, and culture-independent methods like those offered by Second Genome have revealed abundant microbial diversity in unexpected areas, such as cleanrooms and human airways. Our in-depth analysis identifies communities of coexisting microbes that are present in samples related by a criterion such as location, timepoint or treatment. Three commonly used approaches for assessing microbiome data are alpha diversity, beta diversity and taxa-specific comparisons.
Alpha-diversity estimates describe the number of types of organisms in a single sample. These measures can also take into account the evenness of taxa in a sample. Alpha-diversity analyses are useful for examining patterns of dominance, rarity and community complexity. Some common measures of alpha-diversity are:
- Richness ("Observed") = the actual number of different taxa observed in a sample
- Chao 1 Index ("Chao1") = the predicted number of taxa in a sample by extrapolating out the number of rare organisms that may have been missed due to undersampling
- Shannon Diversity Index ("Shannon") = the hybrid measure of the richness of a sample and the evenness of taxa in the sample
In the figure below, 3 different alpha-diversity measures are represented in the panels, with each colored point corresponding to a sample from a different source.
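The three measures listed above can be sketched in a few lines of Python. This is a minimal illustration, not Second Genome's implementation: the function names are hypothetical, the input is assumed to be a list of per-taxon read counts for one sample, Chao1 uses the commonly cited bias-corrected form, and Shannon uses the natural logarithm.

```python
import math


def observed_richness(counts):
    """Number of taxa with at least one observation in the sample."""
    return sum(1 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 estimate of total richness.

    Extrapolates from singletons (taxa seen once) and doubletons
    (taxa seen twice): S_obs + f1 * (f1 - 1) / (2 * (f2 + 1)).
    """
    s_obs = observed_richness(counts)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return s_obs + singletons * (singletons - 1) / (2 * (doubletons + 1))


def shannon(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over observed taxa."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical counts for one sample: six taxa, one of them unobserved.
sample = [10, 5, 2, 1, 1, 0]
print(observed_richness(sample), chao1(sample), round(shannon(sample), 3))
```

A sample dominated by one taxon scores a lower Shannon value than an even sample with the same richness, which is why the index is described as combining richness and evenness.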
Beta-diversity approaches provide a way to compare the microbial community composition between two samples. With these methods we are able to simultaneously compare changes in the presence/absence or abundance of thousands of taxa in a microbiome dataset and summarize these into how 'similar' or 'dissimilar' two samples are. Each sample is compared to every other sample, generating a distance matrix. These distances can be calculated using a variety of methods. The matrix serves as a key input for visual representations of the data to evaluate the differences among the microbiomes of the samples.
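As one illustration of the pairwise comparison described above, the sketch below builds a distance matrix with the Bray-Curtis dissimilarity. The choice of Bray-Curtis is an assumption made for the example (the text does not name a specific metric), and the function names and sample counts are hypothetical.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors.

    0.0 means identical composition; 1.0 means no shared taxa.
    """
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    return numerator / denominator if denominator else 0.0


def distance_matrix(samples):
    """Compare every sample to every other sample (symmetric matrix)."""
    n = len(samples)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = bray_curtis(samples[i], samples[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Hypothetical counts: rows are samples, columns are taxa.
samples = [
    [10, 0, 5],
    [5, 5, 5],
    [0, 20, 0],
]
for row in distance_matrix(samples):
    print([round(d, 3) for d in row])
```

A presence/absence metric such as Jaccard could be swapped in for `bray_curtis` without changing `distance_matrix`.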
Ordination is the most commonly used visualization approach for observing the dissimilarity among samples. An ordination is a two-dimensional plot where each point on the graph represents the entire microbiome of a single sample. Each axis is labeled with the percent of variation between the samples that it explains, with the x-axis representing the greatest dimension of variation and the y-axis representing the second greatest dimension of variation. Samples that cluster together have similar microbial community profiles. If clustering is seen within a set of samples or a study group, a separation of microbiome profiles has been detected.
A common ordination method used for microbiome research is Principal Coordinates Analysis (PCoA). Below is an example of a PCoA ordination looking at samples taken from different areas of the body. You can see that the samples cluster by sampled location fairly well.
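To make the link between the distance matrix and the two plot axes concrete, here is a rough pure-Python sketch of classical PCoA: the squared distances are double-centered, and the two leading eigenvectors become the x and y coordinates. All names are hypothetical. Power iteration with deflation is used only to keep the example dependency-free; it assumes the leading eigenvalues are positive and well separated, and a production analysis would rely on a linear algebra library instead.

```python
import math


def gower_center(dist):
    """Double-center the squared distances: B = -1/2 * J * D^2 * J."""
    n = len(dist)
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in a]
    grand_mean = sum(row_mean) / n
    # The matrix is symmetric, so column means equal row means.
    return [[a[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def leading_eigenpair(m, iterations=1000):
    """Power iteration for the dominant eigenvalue and unit eigenvector."""
    n = len(m)
    v = [float(i + 1) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, [0.0] * n
        v = [x / norm for x in w]
    mv = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * mv[i] for i in range(n)), v  # Rayleigh quotient


def pcoa(dist, n_axes=2):
    """Return per-sample coordinates and the fraction of variation per axis."""
    b = gower_center(dist)
    n = len(b)
    total = sum(b[i][i] for i in range(n))  # trace = sum of eigenvalues
    coords = [[0.0] * n_axes for _ in range(n)]
    explained = []
    for axis in range(n_axes):
        lam, v = leading_eigenpair(b)
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][axis] = v[i] * scale
        explained.append(max(lam, 0.0) / total if total > 0 else 0.0)
        # Deflate: remove the component just extracted.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords, explained


# Hypothetical distances between three samples that lie along one gradient.
dist = [[0.0, 1.0, 3.0], [1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]
coords, explained = pcoa(dist)
print(coords, explained)
```

In this toy case all of the variation falls on the first axis, so the distances between the x-coordinates reproduce the original distances; with real microbiome data the first two axes capture only part of the variation, which is what the axis percentages report.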
Shifts at the whole microbiome level tell us there is a difference in the microbial profiles. The next question is "which taxa are causing the shift?" To answer this question, Second Genome looks for differentially abundant taxa among the comparative sample groups. This approach adjusts for the varying depth of sampling through a normalization step and then applies a statistical model of the noise in the dataset in order to estimate the true signal from each taxon. A parametric statistical test, such as a Wald test, is then applied to determine which taxa are significantly different between groups. Data are typically displayed as a relative fold change in taxa abundance among study groups, as shown in the figure below.
In some data sets, a whole microbiome shift will not be clearly identifiable. However, there are often still shifts in specific microbiota between study groups. These shifts can be subtle, and the differences may not be easily viewed on ordinations or other clustering methods that evaluate the whole microbiome. Including methods for identifying taxa-level differences can therefore provide insight that may be missed in the whole microbiome analyses. These changes can also be grouped at higher phylogenetic levels to help identify trends within a family or genus. Displaying these data as a log2 fold change helps visualize the relative differences between the comparative study groups. In this example you see a significant difference between the microbiomes of skin and feces.
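The log2 fold change itself is simple to compute once the counts are normalized. The sketch below is a deliberate simplification: it normalizes by total reads per sample and compares group means, and it does not implement the noise model or the Wald test described above. The function names, the pseudocount and the counts are all hypothetical.

```python
import math


def relative_abundance(counts):
    """Scale one sample's counts to proportions (a simple depth normalization)."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0 for _ in counts]


def mean_profile(samples):
    """Mean relative abundance of each taxon across a group of samples."""
    profiles = [relative_abundance(s) for s in samples]
    n_taxa = len(samples[0])
    return [sum(p[t] for p in profiles) / len(profiles) for t in range(n_taxa)]


def log2_fold_changes(group_a, group_b, pseudocount=1e-6):
    """log2(mean in group B / mean in group A) for each taxon.

    The pseudocount avoids division by zero for taxa absent from a group.
    """
    mean_a = mean_profile(group_a)
    mean_b = mean_profile(group_b)
    return [math.log2((b + pseudocount) / (a + pseudocount))
            for a, b in zip(mean_a, mean_b)]


# Hypothetical counts: rows are samples, columns are two taxa.
skin = [[10, 90], [12, 88]]
feces = [[40, 60], [44, 56]]
print([round(x, 2) for x in log2_fold_changes(skin, feces)])
```

On this scale a value of 1 means a taxon is twice as abundant in the second group, and -1 means half as abundant, so increases and decreases of the same magnitude plot symmetrically around zero.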